Title
Kd-trees and the real disclosure risks of large statistical databases
Abstract
Estimating the disclosure risk of a Statistical Disclosure Control (SDC) protection method by means of (distance-based) record linkage techniques is a very popular way to analyze the privacy level offered by such a method. When databases are very large, particular record linkage techniques such as blocking or partitioning are usually applied to make this process reasonably efficient. In that case, however, the record linkage process is not exact, which means that the disclosure risk of an SDC protection method may be underestimated. In this paper we propose the use of kd-tree techniques to apply exact yet very efficient record linkage when (protected) datasets are very large. We describe some experiments showing that this approach achieves better results, in terms of both accuracy and running time, than more classical approaches such as record linkage based on a sliding window. We also discuss and experiment with the use of these techniques not to link a whole protected record with its original one, but just to guess the value of some confidential attribute(s) of the record(s). This leads to concepts such as k-neighbor l-diversity and k-neighbor p-sensitivity, generalizations (to any SDC protection method) of l-diversity and p-sensitivity, which have been defined for SDC protection methods ensuring k-anonymity, such as microaggregation.
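The exact, kd-tree-based record linkage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (build_kdtree, nearest, linkage_risk), the squared Euclidean distance, and the assumption that protected record i was derived from original record i are all hypothetical choices made for illustration. The risk estimate here is simply the fraction of protected records whose exact nearest original record is their true source.

```python
def build_kdtree(points, depth=0):
    """Build a kd-tree from a list of (vector, record_id) pairs (illustrative helper)."""
    if not points:
        return None
    axis = depth % len(points[0][0])          # cycle through the attributes
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2                    # median record becomes the node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance (an assumed choice of distance)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return (squared_distance, record_id) of the exact nearest neighbour of query."""
    if node is None:
        return best
    vec, rid = node["point"]
    d = sq_dist(vec, query)
    if best is None or d < best[0]:
        best = (d, rid)
    diff = query[node["axis"]] - vec[node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # The far subtree can only hold a closer record if the splitting plane
    # is nearer than the best match found so far; otherwise it is pruned.
    if diff ** 2 < best[0]:
        best = nearest(far, query, best)
    return best


def linkage_risk(original, protected):
    """Fraction of protected records linked to their true original record.

    Assumes protected[i] was derived from original[i] (hypothetical setup).
    """
    tree = build_kdtree([(vec, i) for i, vec in enumerate(original)])
    hits = sum(1 for i, vec in enumerate(protected) if nearest(tree, vec)[1] == i)
    return hits / len(protected)


# Two lightly perturbed records: both are re-identified.
risk = linkage_risk([(0.0, 0.0), (10.0, 10.0)], [(0.1, 0.1), (9.9, 9.9)])
```

Because a subtree is pruned only when it provably cannot contain a closer record, the search returns the same link as an exhaustive comparison, unlike blocking or a sliding window, which may miss the true nearest record.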
Year
2012
DOI
10.1016/j.inffus.2011.03.001
Venue
Information Fusion
Keywords
record linkage, whole protected record, disclosure risk, real disclosure risk, record linkage process, large statistical databases, statistical disclosure control, classical approach, sdc protection method, record linkage technique, protection method, efficient record linkage, attribute disclosure, particular record linkage technique, kd-trees, kd trees
DocType
Journal
Volume
13
Issue
4
ISSN
Information Fusion
Citations
8
PageRank
0.45
References
18
Authors
3
Name | Order | Citations | PageRank
Javier Herranz | 1 | 628 | 31.52
Jordi Nin | 2 | 311 | 26.53
Marc Solé | 3 | 109 | 12.17