Title
Learning object identification rules for information integration
Abstract
When integrating information from multiple websites, the same data objects can exist in inconsistent text formats across sites, making it difficult to identify matching objects using exact text match. We have developed an object identification system called Active Atlas, which compares the objects' shared attributes in order to identify matching objects. Certain attributes are more important than others for deciding whether a mapping should exist between two objects. Previous methods of object identification have required manual construction of object identification rules, or mapping rules, for determining the mappings between objects. This manual process is time-consuming and error-prone. In our approach, Active Atlas learns to tailor mapping rules, through limited user input, to a specific application domain. The experimental results demonstrate that we achieve higher accuracy and require less user involvement than previous methods across various application domains.
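To make the idea of comparing shared attributes with unequal importance concrete, here is a minimal sketch. It is not the Active Atlas implementation: the attribute names, the fixed weights, the threshold, and the use of Python's `difflib.SequenceMatcher` as the text similarity measure are all assumptions chosen for illustration. In the approach described above, the mapping rules would be learned from limited user input rather than set by hand.

```python
from difflib import SequenceMatcher

# Hypothetical attribute weights and decision threshold (assumptions for
# illustration only; a learned mapping rule would replace these).
WEIGHTS = {"name": 0.6, "street": 0.3, "phone": 0.1}
THRESHOLD = 0.8


def normalize(text):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = text.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def attribute_similarity(a, b):
    """Similarity in [0, 1] between two attribute values."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_score(obj_a, obj_b, weights=WEIGHTS):
    """Weighted average of per-attribute similarities over shared attributes."""
    total = sum(weights.values())
    weighted = sum(
        w * attribute_similarity(obj_a[attr], obj_b[attr])
        for attr, w in weights.items()
    )
    return weighted / total


def is_match(obj_a, obj_b, threshold=THRESHOLD):
    """Declare a mapping when the weighted score reaches the threshold."""
    return match_score(obj_a, obj_b) >= threshold


# Hypothetical records for the same restaurant as listed on two sites,
# plus an unrelated one.
site_a = {"name": "Art's Deli", "street": "12224 Ventura Blvd.", "phone": "818-762-1221"}
site_b = {"name": "Art's Delicatessen", "street": "12224 Ventura Boulevard", "phone": "818/762-1221"}
other = {"name": "Cafe Bizou", "street": "7 Main St", "phone": "310-555-0100"}

print(match_score(site_a, site_b), match_score(site_a, other))
```

With hand-picked weights, the inconsistently formatted pair scores higher than the unrelated pair but may still fall below the threshold, which illustrates why tailoring the rule to the application domain matters.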
Year: 2001
DOI: 10.1016/S0306-4379(01)00042-4
Venue: Inf. Syst.
Keywords: active learning, record linkage, information integration, data cleaning, object identification, object identification rule, machine learning
DocType: Journal
Volume: 26
Issue: 8
ISSN: Information Systems
Citations: 150
PageRank: 13.68
References: 63
Authors: 3
Name               Order   Citations   PageRank
Sheila Tejada      1       704         85.55
Craig A. Knoblock  2       5229        680.57
Steven Minton      3       3473        536.74