Title
More Hybrid and Secure Protection of Statistical Data Sets
Abstract
Various methods and paradigms have been proposed and studied for protecting data sets that contain sensitive statistical information. The idea is to publish a perturbed version of the data set that does not leak confidential information but still allows users to obtain meaningful statistical values about the original data. The two main paradigms for data set protection are the classical one and the synthetic one. Recently, the possibility of combining the two paradigms, leading to a hybrid paradigm, has been considered. In this work, we first analyze the security of some synthetic and (partially) hybrid methods proposed in recent years, and we conclude that they suffer from a high interval disclosure risk. We then propose the first fully hybrid statistical disclosure control (SDC) methods; unfortunately, these also suffer from a fairly high interval disclosure risk. To mitigate this, we propose a postprocessing technique that can be applied to any data set protected with a synthetic method in order to reduce its interval disclosure risk. Throughout the paper, we describe experiments on reference data sets that support our claims.
Year: 2012
DOI: 10.1109/TDSC.2012.40
Venue: IEEE Trans. Dependable Sec. Comput.
Keywords: data privacy, statistical analysis, classical paradigm, confidential information, data set protection, fully hybrid SDC method, high interval disclosure risk, hybrid paradigm, postprocessing technique, sensitive statistical information, statistical data sets, synthetic method, synthetic paradigm, Statistical data sets protection, hybrid methods, interval disclosure risk, synthetic methods
Field: Reference data (financial markets), Publication, Data mining, Data set, Confidentiality, Computer science, Information privacy, Statistical analysis
DocType: Journal
Volume: 9
Issue: 5
ISSN: 1545-5971
Citations: 2
PageRank: 0.36
References: 11
Authors: 3
Javier Herranz: Order 1, Citations 107, PageRank 8.73
Jordi Nin: Order 2, Citations 311, PageRank 26.53
Marc Sole: Order 3, Citations 16, PageRank 1.68