Title
Publishing Search Logs—A Comparative Study of Privacy Guarantees
Abstract
Search engine companies collect the “database of intentions,” the histories of their users' search queries. These search logs are a gold mine for researchers. Search engine companies, however, are wary of publishing search logs for fear of disclosing sensitive information. In this paper, we analyze algorithms for publishing frequent keywords, queries, and clicks of a search log. We first show how methods that achieve variants of k-anonymity are vulnerable to active attacks. We then demonstrate that the stronger guarantee ensured by ε-differential privacy unfortunately does not provide any utility for this problem. Next, we propose an algorithm, ZEALOUS, and show how to set its parameters to achieve (ε, δ)-probabilistic privacy. We also contrast our analysis of ZEALOUS with an analysis by Korolova et al. [17] that achieves (ε′, δ′)-indistinguishability. Our paper concludes with a large experimental study using real applications, in which we compare ZEALOUS with previous work that achieves k-anonymity in search log publishing. Our results show that ZEALOUS yields comparable utility to k-anonymity while at the same time achieving much stronger privacy guarantees.
Year
2012
DOI
10.1109/TKDE.2011.26
Venue
IEEE Trans. Knowl. Data Eng.
Keywords
stronger privacy guarantee, search log, comparative study, zealous yield, search log publishing, search query, comparable utility, differential privacy, search engine company, algorithm zealous, probabilistic privacy, privacy guarantees, publishing search logs, probabilistic logic, information technology, indexation, publishing, system security, data privacy, general, histograms, indexes, search engine, database management, privacy, search engines, history, security
Field
Histogram, Data mining, Search engine, Information retrieval, Computer science, Probabilistic logic, Publishing, Information sensitivity, Information privacy, Database
DocType
Journal
Volume
24
Issue
3
ISSN
1041-4347
Citations
39
PageRank
1.58
References
19
Authors
5
Name | Order | Citations | PageRank
Michaela Götz | 1 | 246 | 10.62
Ashwin Machanavajjhala | 2 | 2624 | 132.52
Guozhang Wang | 3 | 403 | 17.55
Xiaokui Xiao | 4 | 3266 | 142.32
Johannes Gehrke | 5 | 13362 | 1055.06