Title
Supporting top-K keyword search in XML databases
Abstract
Keyword search is considered to be an effective information discovery method for both structured and semi-structured data. In XML keyword search, query semantics is based on the concept of Lowest Common Ancestor (LCA). However, naive LCA-based semantics leads to exponential computation and result size. In the literature, LCA-based semantic variants (e.g., ELCA and SLCA) were proposed, which define a subset of all the LCAs as the results. While most existing work focuses on algorithmic efficiency, top-K processing for XML keyword search is an important issue that has received very little attention. Existing algorithms focusing on efficiency are designed to optimize the semantic pruning and are incapable of supporting top-K processing. On the other hand, straightforward applications of top-K techniques from other areas (e.g., relational databases) generate LCAs that may not be results and unnecessarily expend effort in the semantic pruning. In this paper, we propose a series of join-based algorithms that combine the semantic pruning and the top-K processing to support top-K keyword search in XML databases. The algorithms essentially reduce the keyword query evaluation to relational joins, and incorporate the idea of the top-K join from relational databases. Extensive experimental evaluations show the performance advantages of our algorithms.
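To make the LCA and SLCA notions in the abstract concrete, here is a minimal Python sketch. It is not the paper's join-based algorithm: it assumes nodes are identified by Dewey labels (tuples of child positions, a common but here hypothetical encoding), and it enumerates every combination of keyword matches, which is exactly the exponential behaviour the abstract warns about. The function names `lca`, `all_lcas`, and `slcas` and the sample labels are illustrative only.

```python
from itertools import product


def lca(a, b):
    """Return the longest common prefix of two Dewey labels (their LCA)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def all_lcas(keyword_lists):
    """Naive LCA semantics: one LCA per combination of keyword matches.

    keyword_lists holds one list of Dewey labels per query keyword.
    The number of combinations is the product of the list sizes.
    """
    results = set()
    for combo in product(*keyword_lists):
        node = combo[0]
        for other in combo[1:]:
            node = lca(node, other)
        results.add(node)
    return results


def slcas(keyword_lists):
    """SLCA semantics: keep LCAs that have no other LCA as a descendant."""
    candidates = all_lcas(keyword_lists)
    return {
        c for c in candidates
        if not any(d != c and d[:len(c)] == c for d in candidates)
    }


# Hypothetical matches for a two-keyword query in a small tree rooted at (1,).
xml_matches = [(1, 1, 1), (1, 2, 1)]
search_matches = [(1, 1, 2), (1, 3)]

print(all_lcas([xml_matches, search_matches]))
print(slcas([xml_matches, search_matches]))
```

In this example the root `(1,)` is an LCA but is discarded under SLCA because the more specific node `(1, 1)` already covers both keywords. A top-K variant would additionally need a ranking function, which the abstract does not specify.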
Year
2010
DOI
10.1109/ICDE.2010.5447818
Venue
Data Engineering
Keywords
XML, data mining, query formulation, relational databases, LCA based semantic variant, Lowest Common Ancestor, Top-K keyword search, XML databases, information discovery method, join based algorithm, keyword query evaluation, query semantics, semantic pruning
Field
Data mining, Joins, Query language, Lowest common ancestor, XML, Information retrieval, Relational database, Computer science, XML database, Database, Semantics, Information discovery
DocType
Conference
ISSN
1084-4627
ISBN
978-1-4244-5444-0
Citations
50
PageRank
1.39
References
27
Authors
2
Name                     Order  Citations  PageRank
Liang Jeff Chen          1      56         2.22
Yannis Papakonstantinou  2      5657       837.56