Paper Info

Title
How good is a span of terms?: exploiting proximity to improve web retrieval

Abstract
Ranking search results is a fundamental problem in information retrieval. In this paper we explore whether the use of proximity and phrase information can improve web retrieval accuracy. We build on existing research by incorporating novel ranking features based on flexible proximity terms with recent state-of-the-art machine learning ranking models. We introduce a method of determining the goodness of a set of proximity terms that takes advantage of the structured nature of web documents, document metadata, and phrasal information from search engine user query logs. We perform experiments on a large real-world Web data collection and show that using the goodness score of flexible proximity terms can improve ranking accuracy over state-of-the-art ranking methods by as much as 13%. We also show that we can improve accuracy on the hardest queries by as much as 9% relative to state-of-the-art approaches.

Year	DOI	Venue
2010	10.1145/1835449.1835477	SIGIR
Keywords	Field	DocType
proximity term, phrase information, information retrieval, web retrieval, ranking accuracy, state-of-the-art ranking method, ranking search result, phrasal information, flexible proximity term, ranking model, novel ranking, learning to rank, machine learning, search engine, proximity, data collection, bm25	Data mining, Web search query, Learning to rank, Query expansion, Ranking SVM, Information retrieval, Okapi BM25, Computer science, Ranking (information retrieval), Proximity search, Adversarial information retrieval	Conference
Citations	PageRank	References
29	0.81	20
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Krysta M. Svore	1	826	53.76
Pallika H. Kanani	2	41	1.47
Nazan Khan	3	65	3.06