Keyword Search With Real-Time Entity Resolution In Relational Databases - Citegraph

Paper Info

Title
Keyword Search With Real-Time Entity Resolution In Relational Databases

Abstract
Traditional methods of IR-style keyword search/query in relational databases are based on clean data without entity resolution (ER), and as a result, their answers to a query may contain duplicates for dirty datasets with duplicate tuples that have different identifiers and refer to the same real-world entity. In this paper, we propose a method for processing top-N keyword queries with real-time ER. This method creates an index to obtain candidate tuples for a keyword query, defines a function to compute the similarities between the query and its candidate tuples, and designs a clustering algorithm with the Divide and Conquer mechanism to deduplicate the query results. Extensive experiments are conducted to confirm the effectiveness and efficiency of the method for both dirty and (almost) clean datasets.
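
The abstract names three steps: an index that yields candidate tuples, a query-to-tuple similarity function, and divide-and-conquer clustering that deduplicates the results. The following Python sketch is one possible reading of that pipeline, not the authors' implementation. The abstract does not specify the index structure, the similarity function, or the clustering rule, so every choice here is an assumption made for illustration: a token-level inverted index, Jaccard similarity, a merge threshold of 0.6, the first tuple of each cluster as its representative, and all function names.

```python
from collections import defaultdict


def tokens(text):
    """Lowercase, whitespace-split token set (an assumed tokenizer)."""
    return set(text.lower().split())


def build_index(rows):
    """Inverted index: token -> set of ids of the tuples containing it."""
    index = defaultdict(set)
    for tid, text in rows.items():
        for tok in tokens(text):
            index[tok].add(tid)
    return index


def jaccard(a, b):
    """Jaccard similarity of two token sets (a stand-in, not the paper's function)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(ids, rows, threshold):
    """Divide and conquer: split the ids, cluster each half, merge similar clusters."""
    if not ids:
        return []
    if len(ids) == 1:
        return [[ids[0]]]
    mid = len(ids) // 2
    merged = cluster(ids[:mid], rows, threshold)
    for right in cluster(ids[mid:], rows, threshold):
        for left in merged:
            # Compare cluster representatives (the first member of each cluster).
            if jaccard(tokens(rows[left[0]]), tokens(rows[right[0]])) >= threshold:
                left.extend(right)
                break
        else:
            merged.append(right)
    return merged


def top_n(query, rows, index, n, threshold=0.6):
    """Return up to n (tuple_id, score) pairs, one per deduplicated cluster."""
    q = tokens(query)
    candidates = sorted(set().union(*(index.get(t, set()) for t in q)))
    score = {tid: jaccard(q, tokens(rows[tid])) for tid in candidates}
    best = [max(c, key=score.get) for c in cluster(candidates, rows, threshold)]
    best.sort(key=lambda tid: (-score[tid], tid))
    return [(tid, score[tid]) for tid in best[:n]]


# Hypothetical dirty table: tuples 1 and 2 describe the same real-world entity.
rows = {
    1: "John Smith Boston database systems",
    2: "John Smith Boston database system",
    3: "Mary Jones Chicago machine learning",
    4: "John Smith Denver database design",
}
index = build_index(rows)
result = top_n("john smith database", rows, index, n=2)
print(result)
```

In this toy run, tuples 1 and 2 fall into one cluster, so the duplicate does not take a second slot in the top-N answer. Comparing only cluster representatives keeps the merge step cheap, at the cost of sensitivity to which tuple happens to come first.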

Year	DOI	Venue
2018	10.1145/3195106.3195171	PROCEEDINGS OF 2018 10TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING (ICMLC 2018)
Keywords	Field	DocType
Entity resolution, relational database, similarity, top-N keyword query	Name resolution, Identifier, Information retrieval, Relational database, Computer science, Tuple, Keyword search, Artificial intelligence, Divide and conquer algorithms, Cluster analysis, Machine learning	Conference
Citations	PageRank	References
0	0.34	0
Authors
5

Authors (5 rows)

Name	Order	Citations	PageRank
Liang Zhu	1	4	3.13
Xu Du	2	37	15.92
Qin Ma	3	2	0.74
W. Meng	4	54	249.10
Haibo Liu	5	0	0.34
