Title
Keyword Search With Real-Time Entity Resolution In Relational Databases
Abstract
Traditional methods of IR-style keyword search/query in relational databases are based on clean data without entity resolution (ER), and as a result, their answers to a query may contain duplicates for dirty datasets with duplicate tuples that have different identifiers and refer to the same real-world entity. In this paper, we propose a method for processing top-N keyword queries with real-time ER. This method creates an index to obtain candidate tuples for a keyword query, defines a function to compute the similarities between the query and its candidate tuples, and designs a clustering algorithm with the Divide and Conquer mechanism to deduplicate the query results. Extensive experiments are conducted to confirm the effectiveness and efficiency of the method for both dirty and (almost) clean datasets.
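The pipeline outlined in the abstract (index lookup, query-to-tuple similarity, divide-and-conquer clustering for deduplication) can be illustrated with a minimal sketch. This is not the authors' algorithm. Jaccard similarity over word tokens stands in for the paper's similarity function, clusters are merged by comparing only their first members, and all names (`build_index`, `candidates`, `cluster`, `top_n`, the `threshold` value, the sample tuples) are hypothetical.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercased whitespace tokens as a set (assumed tokenization)."""
    return set(text.lower().split())

def build_index(tuples):
    """Inverted index: token -> set of tuple ids containing it."""
    index = defaultdict(set)
    for tid, text in tuples.items():
        for tok in tokenize(text):
            index[tok].add(tid)
    return index

def candidates(index, query):
    """Tuple ids that share at least one token with the query."""
    ids = set()
    for tok in tokenize(query):
        ids |= index.get(tok, set())
    return ids

def jaccard(a, b):
    """Stand-in similarity; the paper defines its own function."""
    ta, tb = tokenize(a), tokenize(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def cluster(ids, tuples, threshold):
    """Divide and conquer: cluster each half, then merge clusters
    whose first members are at least `threshold` similar."""
    if not ids:
        return []
    if len(ids) == 1:
        return [list(ids)]
    mid = len(ids) // 2
    merged = cluster(ids[:mid], tuples, threshold)
    for rc in cluster(ids[mid:], tuples, threshold):
        for lc in merged:
            if jaccard(tuples[lc[0]], tuples[rc[0]]) >= threshold:
                lc.extend(rc)
                break
        else:
            merged.append(rc)
    return merged

def top_n(query, tuples, index, n, threshold=0.6):
    """Return up to n tuple ids, one per cluster of suspected duplicates."""
    score = lambda tid: jaccard(query, tuples[tid])
    cands = sorted(candidates(index, query), key=lambda t: (-score(t), t))
    reps = [max(c, key=score) for c in cluster(cands, tuples, threshold)]
    reps.sort(key=lambda t: (-score(t), t))
    return reps[:n]

# Hypothetical dirty dataset: ids 1, 2 and 4 describe the same product.
tuples = {
    1: "apple iphone 12 black",
    2: "Apple iPhone 12 Black",
    3: "samsung galaxy s21",
    4: "apple iphone 12 black 64gb",
    5: "apple macbook pro",
}
index = build_index(tuples)
result = top_n("apple iphone", tuples, index, n=2)
```

In this toy dataset the three near-identical iPhone tuples collapse into one cluster, so the top-2 answer holds two distinct entities rather than two copies of the same one.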
Year: 2018
DOI: 10.1145/3195106.3195171
Venue: Proceedings of 2018 10th International Conference on Machine Learning and Computing (ICMLC 2018)
Keywords: Entity resolution, relational database, similarity, top-N keyword query
Field: Name resolution, Identifier, Information retrieval, Relational database, Computer science, Tuple, Keyword search, Artificial intelligence, Divide and conquer algorithms, Cluster analysis, Machine learning
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name
Order
Citations
PageRank
Liang Zhu143.13
Xu Du23715.92
Qin Ma320.74
W. Meng454249.10
Haibo Liu500.34