Title
The Price of Privacy in Untrusted Recommendation Engines
Abstract
The recent increase in online privacy concerns prompts the following question: can a recommendation engine be accurate if end-users do not entrust it with their private data? To answer this, we study the problem of predicting user ratings under local, or 'user-end', differential privacy, a powerful, formal notion of data privacy. We develop a systematic approach, based on mutual information, for deriving lower bounds on the complexity of learning item structure from privatized user inputs. Our results identify a sample-complexity separation between learning in the information-scarce regime and the information-rich regime, thereby highlighting the role of the amount of rating information available to each user. In the information-rich regime (where each user rates a constant fraction of items), a spectral clustering approach is shown to achieve optimal sample complexity. However, the information-scarce regime (where each user rates only a vanishing fraction of the total item set) is found to require a fundamentally different approach. We propose a new algorithm, MaxSense, and show that it achieves optimal sample complexity in this setting. The techniques we develop for bounding mutual information may be of broader interest. To illustrate this, we show their applicability to (i) learning based on 1-bit sketches (in contrast to differentially private sketches), and (ii) adaptive learning, where queries can be adapted based on answers to past queries.
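For context, local ('user-end') differential privacy requires each user to privatize their data before it ever reaches the recommendation engine. The sketch below shows standard randomized response applied to binary ratings, a generic example of such a local mechanism; the epsilon value and the 0/1 rating encoding are illustrative assumptions, and this is not the paper's MaxSense algorithm.

import numpy as np

def randomized_response(bit, epsilon):
    """Privatize one binary rating under epsilon-local differential privacy.

    Standard randomized response: report the true bit with probability
    e^eps / (1 + e^eps), otherwise flip it. Generic illustration of
    user-end privatization, not the MaxSense algorithm from the paper.
    """
    p_truth = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    return bit if np.random.rand() < p_truth else 1 - bit

# Example: a user privatizes their 0/1 rating vector before uploading it.
epsilon = 1.0
true_ratings = np.array([1, 0, 1, 1, 0])
private_ratings = np.array([randomized_response(b, epsilon) for b in true_ratings])
print(private_ratings)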
Year
2012
DOI
10.1109/Allerton.2012.6483317
Venue
2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Keywords
data privacy, trusted computing, sampling methods, recommender systems, internet, computational complexity
DocType
Conference
Volume
abs/1207.3269
ISSN
2474-0195
Citations
8
PageRank
0.44
References
14
Authors
3
Name                  Order  Citations  PageRank
Siddhartha Banerjee   1      185        22.85
Nidhi Hegde           2      234        20.41
Laurent Massoulié     3      3512       244.42