Abstract | | |
---|---|---|
Most existing information retrieval models assume that the terms of a text document are independent of each other. These retrieval models integrate three major variables to determine the importance of a term for a document: within-document term frequency, document length, and the specificity of the term in the collection. Intuitively, the importance of a term for a document depends not only on these three aspects but also on the degree of semantic coherence between the term and the document. In this paper, we propose a heuristic approach in which the degree of semantic coherence of the query terms with a document is used to improve information retrieval performance. Experimental results on standard TREC collections show that the proposed models consistently outperform the state-of-the-art models. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1145/2911451.2914691 | SIGIR |
Keywords | Field | DocType |
Document ranking, Retrieval model, Term weighting | Data mining, Document clustering, Computer science, Explicit semantic analysis, Artificial intelligence, Natural language processing, Term Discrimination, Vector space model, Document retrieval, Heuristic, tf–idf, Information retrieval, Coherence (physics) | Conference |
Citations | PageRank | References |
2 | 0.35 | 12 |
Authors | | |
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xinhui Tu | 1 | 5 | 2.24 |
Xiangji Huang | 2 | 1551 | 159.34 |
Jing Luo | 3 | 8 | 1.14 |
Tingting He | 4 | 348 | 61.04 |