Title
How good is a span of terms?: exploiting proximity to improve web retrieval
Abstract
Ranking search results is a fundamental problem in information retrieval. In this paper we explore whether the use of proximity and phrase information can improve web retrieval accuracy. We build on existing research by incorporating novel ranking features based on flexible proximity terms with recent state-of-the-art machine learning ranking models. We introduce a method of determining the goodness of a set of proximity terms that takes advantage of the structured nature of web documents, document metadata, and phrasal information from search engine user query logs. We perform experiments on a large real-world Web data collection and show that using the goodness score of flexible proximity terms can improve ranking accuracy over state-of-the-art ranking methods by as much as 13%. We also show that we can improve accuracy on the hardest queries by as much as 9% relative to state-of-the-art approaches.
Year: 2010
DOI: 10.1145/1835449.1835477
Venue: SIGIR
Keywords: proximity term, phrase information, information retrieval, web retrieval, ranking accuracy, state-of-the-art ranking method, ranking search result, phrasal information, flexible proximity term, ranking model, novel ranking, learning to rank, machine learning, search engine, proximity, data collection, bm25
Field: Data mining, Web search query, Learning to rank, Query expansion, Ranking SVM, Information retrieval, Okapi BM25, Computer science, Ranking (information retrieval), Proximity search, Adversarial information retrieval
DocType: Conference
Citations: 29
PageRank: 0.81
References: 20
Authors: 3
Name | Order | Citations | PageRank
Krysta M. Svore | 1 | 826 | 53.76
Pallika H. Kanani | 2 | 41 | 1.47
Nazan Khan | 3 | 65 | 3.06