Abstract | ||
---|---|---|
This paper addresses the integration of XML tags into a term-weighting function for focused XML information retrieval (IR). Our model takes into account a certain kind of structural information: tags that represent a logical structure (e.g., title, section, paragraph) as well as other tags (e.g., bold, italic, center). We capture the influence of a tag by estimating the probability that this tag distinguishes relevant terms from the others. These weights are then integrated into a term-weighting function. Experiments on a large collection from the INEX 2008 XML IR evaluation campaign showed improvements in focused XML retrieval. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1007/s10115-011-0426-0 | Knowl. Inf. Syst. |
Keywords | Field | DocType |
focused information retrieval, relevant term, bm25 extension, structural information, certain kind, xml tag, xml retrieval, large collection, logical structure, xml information retrieval, xml ir evaluation campaign, term-weighting function, bm25, xml | Divergence-from-randomness model, Data mining, XML framework, Human–computer information retrieval, Information retrieval, XML, XML validation, Computer science, Document Structure Description, XML database, XML schema | Journal |
Volume | Issue | ISSN |
32 | 1 | 0219-3116 |
Citations | PageRank | References |
11 | 0.61 | 33 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Mathias Géry | 1 | 137 | 37.23 |
Christine Largeron | 2 | 148 | 30.40 |
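
The abstract describes two steps: estimating, for each tag, how well it separates relevant terms from the others, and then folding those tag weights into a term-weighting function (the keywords mention a BM25 extension). The sketch below illustrates that general idea only; it is not the authors' formulation. The smoothed estimate of P(relevant | tag), the averaging over tags, the neutral value of 0.5, and all function names are assumptions made for illustration.

```python
import math
from collections import Counter


def estimate_tag_weights(observations, smoothing=1.0):
    """Estimate a weight per tag from (tag, is_relevant) term occurrences.

    Assumption: the weight is a Laplace-smoothed estimate of
    P(term is relevant | term occurs inside this tag).
    """
    relevant = Counter()
    total = Counter()
    for tag, is_relevant in observations:
        total[tag] += 1
        if is_relevant:
            relevant[tag] += 1
    return {
        tag: (relevant[tag] + smoothing) / (total[tag] + 2 * smoothing)
        for tag in total
    }


def bm25_weight(tf, df, n_docs, doc_len, avg_len, k1=1.2, b=0.75):
    """Standard BM25 weight of one term in one document or element."""
    idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
    norm = tf + k1 * (1 - b + b * doc_len / avg_len)
    return idf * tf * (k1 + 1) / norm


def tagged_weight(base_weight, tags, tag_weights, neutral=0.5):
    """Scale a base term weight by the tags that mark the term.

    Assumption: the tag weights are averaged and divided by a neutral
    value, so an untagged term or an uninformative tag leaves the
    base weight unchanged.
    """
    if not tags:
        return base_weight
    mean = sum(tag_weights.get(t, neutral) for t in tags) / len(tags)
    return base_weight * (mean / neutral)


# Hypothetical training observations: (tag, term judged relevant?)
observations = [("title", True), ("title", True), ("title", False), ("i", False)]
weights = estimate_tag_weights(observations)
base = bm25_weight(tf=3, df=10, n_docs=1000, doc_len=100, avg_len=100)
boosted = tagged_weight(base, ["title"], weights)
```

With these toy observations, a term inside `title` is boosted relative to its plain BM25 weight, while a term inside `i` is dampened. A real system would estimate the tag weights from relevance assessments such as those distributed with an evaluation campaign.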