Title
Learning-Based Methods with Human-in-the-Loop for Entity Resolution
Abstract
This tutorial is intended for researchers and practitioners working in the data integration area and, in particular, entity resolution (ER), which is a sub-area focused on linking entities across heterogeneous datasets. We outline the ideal requirements of modern ER systems: (1) capture domain knowledge via (minimal) human interaction, (2) provide as much automation as possible via machine learning techniques, and (3) achieve high explainability. We describe recent research trends towards bringing such ideal ER systems closer to reality. We begin with an overview of human-in-the-loop methods that are based on techniques such as crowdsourcing and active learning. We then dive into recent trends that involve deep learning techniques such as representation learning to automate feature engineering, and combinations of transfer and active learning to reduce the amount of user labels required. We also discuss how explainable AI relates to ER, and outline some of the recent advances towards explainable ER.
Year: 2019
DOI: 10.1145/3357384.3360316
Venue: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
Keywords: entity resolution, human-in-the-loop, machine learning
Field: Data mining, Name resolution, Information retrieval, Computer science, Human-in-the-loop
DocType: Conference
ISBN: 978-1-4503-6976-3
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name
Order
Citations
PageRank
Sairam Gurajada 11187.83
Ling-ling Yan 2127370.78
Kun Qian 382.81
Prithviraj Sen 483738.24