Abstract | ||
---|---|---|
We present a new framework for indexing, locating, and ranking XML documents based on content and structural synopses extracted from the documents. Instead of indexing every element or term in a document, we extract a structural summary and a small number of data synopses from the document, which are indexed in an efficient way suitable for query evaluation. Our query language is XPath extended with full-text search. The result of query evaluation is a ranked list of document locations that best match the query. We propose a novel aggregated ranking scheme, which is integrated into the query evaluation to score the documents based on these data synopses. Our experimental evaluation shows that our indexing scheme outperforms the standard XML indexing scheme based on inverted lists, and that our ranking scheme is effective in terms of precision and recall. |
Year | DOI | Venue |
---|---|---|
2007 | 10.1007/978-3-540-74469-6_54 | DEXA |
Keywords | Field | DocType |
query language,structure synopses,structural summary,experimental evaluation,document location,query evaluation,ranking xml documents,ranking scheme,ranking xml document,standard xml indexing scheme,data synopsis,indexing scheme,structure synopsis,xml document,indexation | Query optimization,Data mining,Web search query,RDF query language,Query language,Information retrieval,Query expansion,Computer science,Web query classification,Ranking (information retrieval),XPath,Database | Conference |
Volume | ISSN | ISBN |
4653 | 0302-9743 | 3-540-74467-3 |
Citations | PageRank | References |
0 | 0.34 | 8 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Weimin He | 1 | 2 | 2.75 |
Leonidas Fegaras | 2 | 793 | 158.81 |
David Levine | 3 | 118 | 9.73 |