Title
Bootstrapping Privacy Compliance in Big Data Systems
Abstract
With the rapid increase in cloud services collecting and using user data to offer personalized experiences, ensuring that these services comply with their privacy policies has become a business imperative for building user trust. However, most compliance efforts in industry today rely on manual review processes and audits designed to safeguard user data, and are therefore resource-intensive and lack coverage. In this paper, we present our experience building and operating a system to automate privacy policy compliance checking in Bing. Central to the design of the system are (a) Legalease, a language for specifying privacy policies that impose restrictions on how user data is handled, and (b) Grok, a data inventory for Map-Reduce-like big data systems that tracks how user data flows among programs. Grok maps code-level schema elements to data types in Legalease, in essence annotating existing programs with information flow types with minimal human input. Compliance checking is thus reduced to information flow analysis of big data systems. The system, bootstrapped by a small team, daily checks the compliance of millions of lines of ever-changing source code written by several thousand developers.
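As a rough illustration of the reduction the abstract describes, the sketch below treats compliance checking as label propagation over a data-flow graph. It is a toy under stated assumptions, not Legalease syntax and not Grok's implementation: the policy clause, the dataset and job names, the data types, and the purposes are all hypothetical.

```python
from collections import deque

# Hypothetical policy: each pair denies using a data type for a purpose.
POLICY_DENY = {("IPAddress", "Advertising")}

# Hypothetical inventory: flow edges between columns and jobs,
# data types seeded on source columns, and the purpose of each job.
FLOWS = {
    "raw_logs.client_ip": ["sessionize_job"],
    "raw_logs.query": ["sessionize_job"],
    "sessionize_job": ["ad_ranking_job", "abuse_detection_job"],
    "ad_ranking_job": [],
    "abuse_detection_job": [],
}
SEED_TYPES = {
    "raw_logs.client_ip": {"IPAddress"},
    "raw_logs.query": {"SearchQuery"},
}
PURPOSE = {
    "ad_ranking_job": "Advertising",
    "abuse_detection_job": "AbuseDetection",
}


def propagate_types(flows, seed_types):
    """Push data-type labels along flow edges until no label set grows."""
    labels = {node: set(seed_types.get(node, ())) for node in flows}
    work = deque(seed_types)
    while work:
        node = work.popleft()
        for succ in flows.get(node, []):
            target = labels.setdefault(succ, set())
            before = len(target)
            target |= labels.get(node, set())
            if len(target) > before:
                work.append(succ)  # revisit successors of a changed node
    return labels


def find_violations(flows, seed_types, purpose, deny):
    """Report (job, data type, purpose) triples that the policy denies."""
    labels = propagate_types(flows, seed_types)
    return sorted(
        (node, dtype, use)
        for node, use in purpose.items()
        for dtype in labels.get(node, ())
        if (dtype, use) in deny
    )


VIOLATIONS = find_violations(FLOWS, SEED_TYPES, PURPOSE, POLICY_DENY)
print(VIOLATIONS)  # [('ad_ranking_job', 'IPAddress', 'Advertising')]
```

In this toy, only the source columns carry hand-assigned types; every downstream job inherits its labels from the flow edges, which loosely mirrors the abstract's point about annotating existing programs with minimal human input.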
Year
2014
DOI
10.1109/SP.2014.28
Venue
IEEE Symposium on Security and Privacy
Keywords
bing, user data handling, privacy, compliance, program analysis, information flow, policy, big data, datatypes, data privacy, parallel programming, program annotation, conformance testing, personalized user experiences, source code (software), information flow types, cloud services, business imperative privacy policies, computer bootstrapping, grok data inventory, code-level schema element mapping, legalease language, minimal human input, user trust, map-reduce-like big data systems, cloud computing, source code, search engines, privacy compliance bootstrapping, web services, privacy policy specification, automatic privacy policy compliance checking, lattices, advertising, semantics
Field
Information flow (information theory), Internet privacy, Privacy by Design, Computer security, Computer science, Source code, Privacy policy, Data type, Information privacy, Big data, Cloud computing
DocType
Conference
ISSN
1081-6011
Citations
17
PageRank
0.70
References
25
Authors
6
Name | Order | Citations | PageRank
Shayak Sen | 1 | 96 | 8.89
Saikat Guha | 2 | 1546 | 116.91
Anupam Datta | 3 | 1617 | 87.21
Sriram K. Rajamani | 4 | 3386 | 246.27
Janice Y. Tsai | 5 | 17 | 0.70
Jeannette M. Wing | 6 | 6429 | 874.60