Title
Locating and Ranking XML Documents Based on Content and Structure Synopses
Abstract
We present a new framework for indexing, locating and ranking XML documents based on content and structural synopses extracted from the documents. Instead of indexing each single element or term in a document, we extract a structural summary and a small number of data synopses from the document, which are indexed in an efficient way suitable for query evaluation. Our query language is XPath extended with full-text search. The result of query evaluation is a ranked list of document locations that best match the query. We propose a novel aggregated ranking scheme, which is integrated into the query evaluation to score the documents based on those data synopses. Our experimental evaluation shows that our indexing scheme outperforms the standard XML indexing scheme based on inverted lists and our ranking scheme is effective in terms of precision and recall.
Year
2007
DOI
10.1007/978-3-540-74469-6_54
Venue
DEXA
Keywords
query language, structure synopses, structural summary, experimental evaluation, document location, query evaluation, ranking xml documents, ranking scheme, ranking xml document, standard xml indexing scheme, data synopsis, indexing scheme, structure synopsis, xml document, indexation
Field
Query optimization, Data mining, Web search query, RDF query language, Query language, Information retrieval, Query expansion, Computer science, Web query classification, Ranking (information retrieval), XPath, Database
DocType
Conference
Volume
4653
ISSN
0302-9743
ISBN
3-540-74467-3
Citations
0
PageRank
0.34
References
8
Authors
3
Name
Order
Citations
PageRank
Weimin He122.75
Leonidas Fegaras2793158.81
David Levine31189.73