Title
Tie Breaker: A Novel Way of Combining Retrieval Signals
Abstract
Empirical studies of information retrieval suggest that the effectiveness of a retrieval function is closely related to how it combines multiple retrieval signals, including term frequency, inverse document frequency, and document length. Although it is relatively easy to capture how each signal contributes to the relevance score, it is more challenging to find the best way of combining these signals, since they often interact with each other in complicated ways. As a result, when a retrieval function is derived from a traditional retrieval model, the choice of one implementation over another has often been based on empirical observations rather than sound theoretical derivations. In this paper, we propose a novel way of combining retrieval signals to derive robust retrieval functions. Instead of seeking an integrated way of combining these signals into a complex mathematical retrieval function, our main idea is to prioritize the retrieval signals, apply the strongest signal first to rank documents, and then iteratively use the weaker signals to break ties among documents with the same score. One unique advantage of our method is that it eliminates the need for complicated implementations of the signals and enables a simple yet elegant way of combining multiple signals for document ranking. Empirical results show that the proposed method can achieve performance comparable to state-of-the-art retrieval functions on traditional TREC ad hoc retrieval collections, and can outperform them on TREC microblog retrieval collections.
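The tie-breaking idea in the abstract can be sketched as a lexicographic sort: documents are ordered by the strongest signal, and each weaker signal only matters among documents that tie on all stronger ones. The sketch below is illustrative rather than the authors' implementation. The function name `tie_breaker_rank`, the document ids, and the signal values are all hypothetical, and the assumption that every signal is "higher is better" and already listed from strongest to weakest is ours, since the abstract does not say which signals are used or how they are prioritized.

```python
from typing import Dict, List, Sequence


def tie_breaker_rank(signals: Dict[str, Sequence[float]]) -> List[str]:
    """Rank document ids by prioritized signals.

    Each document maps to a sequence of signal values ordered from the
    strongest signal to the weakest (an assumption of this sketch).
    Comparing the sequences as tuples is lexicographic, so a weaker
    signal is consulted only when all stronger signals are tied.
    """
    return sorted(signals, key=lambda doc_id: tuple(signals[doc_id]), reverse=True)


# Hypothetical signal values: (strongest, second, weakest), higher is better.
signals = {
    "d1": (2, 0.5, 0.90),
    "d2": (3, 0.1, 0.20),
    "d3": (2, 0.7, 0.10),
    "d4": (2, 0.5, 0.95),
}

ranking = tie_breaker_rank(signals)
print(ranking)  # ['d2', 'd3', 'd4', 'd1']
```

Here `d2` wins on the first signal alone; `d1`, `d3`, and `d4` tie on it, so the second signal places `d3` ahead, and the third signal separates `d4` from `d1`. Weaker signals only get a say when the stronger ones produce ties, so a scheme like this presumably depends on the stronger signals being coarse enough to tie often.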
Year
2013
DOI
10.1145/2499178.2499192
Venue
ICTIR
Keywords
retrieval signal, art retrieval function, trec microblog retrieval collection, information retrieval, retrieval collection, robust retrieval function, multiple retrieval, combining retrieval signals, complex mathematical retrieval function, tie breaker, traditional retrieval model, retrieval function, prediction
Field
Divergence-from-randomness model, Data mining, Social media, tf–idf, Information retrieval, Ranking, Empirical evidence, Computer science, Microblogging, Circuit breaker, Term Discrimination, Empirical research
DocType
Conference
Citations
3
PageRank
0.48
References
12
Authors
2
Name | Order | Citations | PageRank
Hao Wu | 1 | 37 | 4.58
Hui Fang | 2 | 918 | 63.03