Title
Scalable Privacy-Preserving Record Linkage for Multiple Databases
Abstract
Privacy-preserving record linkage (PPRL) is the process of identifying records that correspond to the same real-world entities across several databases without revealing any sensitive information about these entities. Various techniques have been developed to tackle the problem of PPRL, with the majority of them only considering linking two databases. However, in many real-world applications data from more than two sources need to be linked. In this paper we consider the problem of linking data from three or more sources in an efficient and secure way. We propose a protocol that combines the use of Bloom filters, secure summation, and Dice coefficient similarity calculation with the aim of identifying all records held by the different data sources that have a similarity above a certain threshold. Our protocol is secure in that no party learns any sensitive information about the other parties' data, but all parties learn which of their records have a high similarity with records held by the other parties. We evaluate our protocol on a large dataset, showing its scalability, linkage quality, and privacy.
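The three building blocks named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' protocol: the function names, the q-gram length, the filter size, and the hashing scheme are all assumptions chosen for illustration, the parties are simulated in a single process, and the common bit positions are computed in the clear, which a real protocol would avoid. The similarity uses a common multi-party generalization of the Dice coefficient, P * c / (x_1 + ... + x_P), where c is the number of bit positions set in all P filters and x_i is the number of bits set in filter i.

```python
import hashlib
import secrets


def qgrams(value, q=2):
    """Split a string into its set of overlapping q-grams (q=2 is an assumption)."""
    value = value.lower()
    return {value[i:i + q] for i in range(len(value) - q + 1)}


def bloom_filter(value, num_bits=64, num_hashes=2):
    """Encode a string's q-grams as the set of 1-bit positions of a Bloom filter.

    The seeded SHA-256 hashing used here is a hypothetical choice for the sketch.
    """
    bits = set()
    for gram in qgrams(value):
        for seed in range(num_hashes):
            digest = hashlib.sha256(f"{seed}:{gram}".encode()).hexdigest()
            bits.add(int(digest, 16) % num_bits)
    return bits


def secure_sum(private_values, modulus=2**32):
    """Simulate ring-based secure summation.

    The first party masks its value with a random number, each further
    party adds its own value to the running total, and the first party
    removes the mask at the end. No party sees another party's raw value.
    """
    mask = secrets.randbelow(modulus)
    running = (mask + private_values[0]) % modulus
    for value in private_values[1:]:
        running = (running + value) % modulus
    return (running - mask) % modulus


def multi_party_dice(filters):
    """Dice coefficient over P Bloom filters: P * c / sum of 1-bit counts."""
    # Simplification: the common positions are intersected in the clear here.
    common = set.intersection(*filters)
    total_ones = secure_sum([len(f) for f in filters])
    if total_ones == 0:
        return 0.0
    return len(filters) * len(common) / total_ones


if __name__ == "__main__":
    # Three hypothetical parties each hold one name value.
    names = ["christen", "christen", "kristen"]
    similarity = multi_party_dice([bloom_filter(n) for n in names])
    print(f"similarity = {similarity:.3f}, match = {similarity >= 0.8}")
```

The threshold of 0.8 in the usage lines is arbitrary; the abstract only states that records with a similarity above a certain threshold are identified.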
Year
2014
DOI
10.1145/2661829.2661875
Venue
CIKM
Keywords
security; multi-party; privacy; bloom filter; security, integrity, and protection; record linkage
Field
Bloom filter, Data mining, Record linkage, Information retrieval, Sørensen–Dice coefficient, Computer science, Information sensitivity, Database, Scalability
DocType
Conference
Citations
10
PageRank
0.59
References
11
Authors
2
Name, Order, Citations, PageRank
Dinusha Vatsalan, 1, 10, 0.92
Peter Christen, 2, 1697, 107.21