Title
Score distributions for Pseudo Relevance Feedback.
Abstract
Relevance-Based Language Models, commonly known as Relevance Models, are a successful approach to explicitly introducing the concept of relevance into the statistical language modelling framework of Information Retrieval. These models achieve state-of-the-art retrieval performance in the Pseudo Relevance Feedback task. One of the factors that most affects the robustness of Pseudo Relevance Feedback is known to be the selection, for some queries, of harmful expansion terms. To minimise this effect, it is crucial to reduce the number of non-relevant documents in the pseudo-relevant set. In this paper, we propose an original approach to this problem: we try to determine automatically, for each query, how many documents should be selected as the pseudo-relevant set. To this end, we study the score distributions of the initial retrieval and try to discern, on the basis of those distributions, between relevant and non-relevant documents. Evaluation of our proposal showed important improvements in terms of robustness.
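The idea of choosing a per-query size for the pseudo-relevant set from the initial retrieval scores can be illustrated with a minimal sketch. The abstract does not say which distributions are fitted or how the cut-off is derived, so the code below swaps in a much simpler stand-in: it splits the ranked scores into two groups at the point that maximises the between-group variance (an Otsu-style threshold). The function name `pseudo_relevant_cutoff` and the parameter `min_docs` are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def pseudo_relevant_cutoff(scores, min_docs=1):
    """Return how many top-ranked documents to keep as the pseudo-relevant set.

    scores: retrieval scores from the initial ranking (any order).
    The split maximises the between-group variance of "top-k" versus "rest".
    This is an illustrative stand-in, NOT the paper's score-distribution model.
    """
    ranked = sorted(scores, reverse=True)
    n = len(ranked)
    if n <= min_docs:
        return n  # too few documents to split

    best_k, best_gain = min_docs, float("-inf")
    for k in range(min_docs, n):
        top, rest = ranked[:k], ranked[k:]
        # Between-group variance, up to the constant factor 1 / n**2.
        gain = k * (n - k) * (fmean(top) - fmean(rest)) ** 2
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k


# Three scores stand clearly apart from the rest of the ranking.
example_scores = [9.1, 8.7, 8.9, 3.2, 3.0, 2.9, 2.8, 2.5]
print(pseudo_relevant_cutoff(example_scores))  # 3
```

With such a cut-off, a query whose ranking shows a sharp drop after a few documents would feed only those documents to the feedback step, whereas a fixed-size set would also include lower-scoring documents.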
Year
2014
DOI
10.1016/j.ins.2014.03.034
Venue
Information Sciences
Keywords
Information Retrieval, Pseudo Relevance Feedback, Score distribution, Pseudo Relevance Feedback set, Relevance Model
Field
Relevance feedback, Information retrieval, Robustness (computer science), Artificial intelligence, Language modelling, Machine learning, Language model, Mathematics
DocType
Journal
Volume
273
ISSN
0020-0255
Citations
9
PageRank
0.49
References
32
Authors
3
Name                          Order  Citations  PageRank
Javier Parapar                1      188        25.91
Manuel A. Presedo Quindimil   2      9          0.82
Alvaro Barreiro               3      226        22.42