Abstract |
---|
Statistical analysis allows user traces to be matched with prior behavior so as to identify the user and hence compromise their privacy. There are two commonly used techniques to protect user identities: (1) anonymization, where identities are permuted periodically to prevent statistical analysis of long time series; (2) obfuscation, where user traces are obscured by noise to obtain privacy. We explore privacy when user traces are independent and identically distributed (i.i.d.) Gaussian series; i.e., for each user, we observe a time series with the data sample at each time instant drawn from an i.i.d. Gaussian distribution with a user-dependent mean. We consider both anonymization and obfuscation techniques, and study how the two techniques impact the level of privacy. We provide: (1) an exact expression for the error probability of identifying the users when the number of users is finite; (2) an asymptotic analysis of how user privacy varies with different degrees of anonymization and obfuscation as the number of users grows large. We show that there exist thresholds for the two techniques that separate the regions of user privacy: above either of the thresholds, not all users lose privacy; below both of the thresholds, users have no privacy. |
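
The setting described in the abstract can be illustrated with a minimal simulation sketch. It is not the authors' analysis. The unit trace variance, the uniform prior on the user means, the single permutation over the whole observation window, and the rank-matching adversary are all assumptions made here for illustration, and the function name `simulate_matching` is hypothetical.

```python
import random
import statistics


def simulate_matching(n_users, n_samples, noise_std, seed=0):
    """Return the fraction of users a rank-matching adversary re-identifies.

    Assumptions (not taken from the paper): user means are drawn uniformly
    from [0, 1], traces have unit variance, obfuscation is additive Gaussian
    noise, and identities are permuted once for the whole window.
    """
    rng = random.Random(seed)

    # Each user has a user-dependent mean, assumed known to the adversary.
    means = [rng.uniform(0.0, 1.0) for _ in range(n_users)]

    # Obfuscation: i.i.d. Gaussian samples with extra noise added per sample.
    traces = [
        [rng.gauss(mu, 1.0) + rng.gauss(0.0, noise_std) for _ in range(n_samples)]
        for mu in means
    ]

    # Anonymization: pseudonym p carries the trace of user perm[p].
    perm = list(range(n_users))
    rng.shuffle(perm)
    anonymized = [traces[user] for user in perm]

    # Adversary: rank pseudonyms by sample mean and users by true mean,
    # then pair them in sorted order (least-squares matching in one dimension).
    sample_means = [statistics.fmean(trace) for trace in anonymized]
    pseudonyms_by_rank = sorted(range(n_users), key=lambda p: sample_means[p])
    users_by_rank = sorted(range(n_users), key=lambda u: means[u])

    guess = [None] * n_users
    for pseudonym, user in zip(pseudonyms_by_rank, users_by_rank):
        guess[pseudonym] = user

    correct = sum(1 for p in range(n_users) if guess[p] == perm[p])
    return correct / n_users


if __name__ == "__main__":
    for noise_std in (0.0, 1.0, 5.0):
        fraction = simulate_matching(n_users=20, n_samples=500, noise_std=noise_std)
        print(f"noise_std={noise_std}: fraction identified = {fraction:.2f}")
```

In this sketch, longer traces (larger `n_samples`) give the adversary more accurate sample means, while larger `noise_std` or more users packed into the same range of means makes the ranking less reliable. It does not reproduce the exact error probability or the thresholds stated in the abstract.
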
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/CISS.2018.8362258 | 2018 52nd Annual Conference on Information Sciences and Systems (CISS) |
Keywords | Field | DocType |
---|---|---|
Privacy,Statistical Matching,Anonymization and Obfuscation | Data mining,Time series,Random variable,Sample (statistics),Computer science,Computer network,Gaussian,Independent and identically distributed random variables,Obfuscation,Probability of error,Asymptotic analysis | Conference |
ISBN | Citations | PageRank |
---|---|---|
978-1-5386-0580-6 | 0 | 0.34 |
References | Authors |
---|---|
0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ke Li | 1 | 50 | 26.41 |
Hossein Pishro-Nik | 2 | 429 | 45.84 |
Dennis Goeckel | 3 | 1060 | 69.96 |