Title
Record linkage performance for large data sets
Abstract
We propose new data structures that speed up record linkage by exploiting the value distribution of common string attributes such as name or surname. At the cost of some additional memory, they increase processing speed by almost an order of magnitude with no loss of recall or precision. The improvement is independent of both the method used to reduce the number of record comparisons, such as Blocking or Sliding Window, and the specific string comparison functions.
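The abstract does not describe the data structures themselves, so the following is only a minimal sketch of the general idea suggested by the "memoization" keyword, not the authors' method. It assumes a symmetric comparison function (here `difflib.SequenceMatcher` stands in for any string comparator) and uses a hypothetical `MemoizedComparator` class and toy surname list. Because a few values dominate attributes like surnames, caching the score per distinct value pair avoids most repeated comparisons and trades memory for speed.

```python
from difflib import SequenceMatcher


class MemoizedComparator:
    """Caches comparison scores per unordered pair of string values.

    Hypothetical helper for illustration; assumes `compare` is symmetric.
    """

    def __init__(self, compare):
        self._compare = compare
        self._cache = {}
        self.misses = 0  # number of times the real comparator was called

    def __call__(self, a, b):
        # Canonical ordering so (a, b) and (b, a) share one cache entry.
        key = (a, b) if a <= b else (b, a)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._compare(*key)
        return self._cache[key]


def ratio(a, b):
    # Stand-in similarity function; any string comparator could be used.
    return SequenceMatcher(None, a, b).ratio()


# Toy attribute column with a skewed value distribution (assumed data).
surnames = ["garcia", "garcia", "martinez", "garcia", "martinez", "lopez"]

cached_ratio = MemoizedComparator(ratio)
# Compare every record pair once, as a blocking step might within one block.
scores = [
    cached_ratio(a, b)
    for i, a in enumerate(surnames)
    for b in surnames[i + 1:]
]
```

In this toy run, 15 record pairs trigger only 5 calls to the underlying comparator, and the scores themselves are unchanged, which is consistent with the abstract's claim that the speed-up costs memory but not recall or precision.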
Year
2009
DOI
10.1145/1651449.1651453
Venue
CIKM-PAVLAD
Keywords
sliding window, large data set, processing speed, record linkage performance, specific string comparison function, new data structure, usual string attribute, record comparison, value distribution, record linkage, additional memory, memoization, deduplication, data structure
DocType
Conference
Citations
1
PageRank
0.35
References
13
Authors
3
Name                     Order  Citations  PageRank
Jordi Gómez-Bao          1      1          0.35
Josep-L. Larriba-Pey     2      162        17.44
Josepa Ribes Puig        3      1          0.35