Abstract | ||
---|---|---|
Recent increase in online privacy concerns prompts the following question: can a recommender system be accurate if users do not entrust it with their private data? To answer this, we study the problem of learning item-clusters under local differential privacy, a powerful, formal notion of data privacy. We develop bounds on the sample-complexity of learning itemclusters from privatized user inputs. Significantly, our results identify a sample-complexity separation between learning in an information-rich and an information-scarce regime, thereby highlighting the interaction between privacy and the amount of information (ratings) available to each user. In the information-rich regime, where each user rates at least a constant fraction of items, a spectral clustering approach is shown to achieve a sample-complexity lower bound derived from a simple information-theoretic argument based on Fano’s inequality. However, the information-scarce regime, where each user rates only a vanishing fraction of items, is found to require a fundamentally different approach both for lower bounds and algorithms. To this end, we develop new techniques for bounding mutual information under a notion of channel-mismatch. These techniques may be of broader interest, and we illustrate this by applying them to (i) learning based on 1-bit sketches, and (ii) adaptive learning. Finally, we propose a new algorithm, MaxSense, and show that it achieves optimal sample-complexity in the information-scarce regime. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/JSTSP.2015.2423254 | Selected Topics in Signal Processing, IEEE Journal of |
Keywords | Field | DocType |
databases,clustering algorithms,mutual information,recommender systems,data privacy,privacy | Recommender system,Data mining,Mathematical optimization,Differential privacy,Computer science,Theoretical computer science,Mutual information,Information privacy,Cluster analysis,Adaptive learning,Privacy software,Bounding overwatch | Journal |
Volume | Issue | ISSN |
PP | 99 | 1932-4553 |
Citations | PageRank | References |
1 | 0.35 | 17 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Siddhartha Banerjee | 1 | 185 | 22.85 |
Nidhi Hegde | 2 | 234 | 20.41 |
Laurent Massoulié | 3 | 3512 | 244.42 |