Title
A Heterogeneous Field Matching Method for Record Linkage
Abstract
Record linkage is the process of determining that two records refer to the same entity. A key subprocess is evaluating how well the individual fields, or attributes, of the records match each other. One approach to matching fields is to use hand-written domain-specific rules. This "expert systems" approach may result in good performance for specific applications, but it is not scalable. This paper describes a new machine learning approach that creates expert-like rules for field matching. In our approach, the relationship between two field values is described by a set of heterogeneous transformations. Previous machine learning methods used simple models to evaluate the distance between two fields. However, our approach enables more sophisticated relationships to be modeled, which better capture the complex, domain-specific, common-sense phenomena that humans use to judge similarity. We compare our approach to methods that rely on simpler homogeneous models in several domains. By modeling more complex relationships, we produce more accurate results.
Year
2005
DOI
10.1109/ICDM.2005.7
Venue
ICDM
Keywords
learning artificial intelligence, record linkage, machine learning, database management systems, pattern matching
Field
Record linkage, Online machine learning, Data mining, Active learning (machine learning), Homogeneous, Computer science, Expert system, Artificial intelligence, Pattern matching, Machine learning, Scalability
DocType
Conference
ISSN
1550-4786
ISBN
0-7695-2278-5
Citations
23
PageRank
1.37
References
10
Authors
5
Name                 Order   Citations   PageRank
Steven Minton        1       3473        536.74
Claude Nanjo         2       24          2.45
Craig A. Knoblock    3       5229        680.57
Martin Michalowski   4       155         15.03
Matthew Michelson    5       409         22.23