Abstract | ||
---|---|---|
Long queries frequently contain many extraneous terms that hinder retrieval of relevant documents. We present techniques to reduce long queries to more effective shorter ones that lack those extraneous terms. Our work is motivated by the observation that perfectly reducing long TREC description queries can lead to an average improvement of 30% in mean average precision. Our approach involves transforming the reduction problem into a problem of learning to rank all subsets of the original query (sub-queries) based on their predicted quality, and selecting the top sub-query. We use various measures of query quality described in the literature as features to represent sub-queries, and train a classifier. Replacing the original long query with the top-ranked sub-query chosen by the ranker results in a statistically significant average improvement of 8% on our test sets. Analysis of the results shows that query reduction is well-suited for moderately-performing long queries, and that a small set of query quality predictors is well-suited for the task of ranking sub-queries. |
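
The reduction step described in the abstract (enumerate the sub-queries, score each by predicted quality, keep the top one) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `sub_queries`, `reduce_query`, `toy_quality`, the example query, and the per-term weights in `TOY_WEIGHTS` are hypothetical and merely stand in for the trained ranker and the query quality predictors used in the paper.

```python
from itertools import combinations
from typing import Callable, Iterator, Sequence, Tuple

SubQuery = Tuple[str, ...]


def sub_queries(terms: Sequence[str]) -> Iterator[SubQuery]:
    """Yield every non-empty subset of the query terms, keeping term order."""
    for size in range(1, len(terms) + 1):
        yield from combinations(terms, size)


def reduce_query(terms: Sequence[str],
                 predict_quality: Callable[[SubQuery], float]) -> SubQuery:
    """Return the sub-query with the highest predicted quality."""
    return max(sub_queries(terms), key=predict_quality)


# Hypothetical per-term weights; a real system would compute learned
# quality-predictor features per sub-query rather than a lookup table.
TOY_WEIGHTS = {
    "identify": 0.1, "documents": 0.1, "that": 0.0, "discuss": 0.1,
    "ocean": 0.9, "remote": 0.8, "sensing": 0.9,
}


def toy_quality(sub_query: SubQuery) -> float:
    """Toy scorer: terms weighted above 0.5 help, terms below 0.5 hurt."""
    return sum(TOY_WEIGHTS.get(term, 0.0) - 0.5 for term in sub_query)


# Hypothetical long query, used only to exercise the sketch.
QUERY = ["identify", "documents", "that", "discuss",
         "ocean", "remote", "sensing"]

if __name__ == "__main__":
    print(" ".join(reduce_query(QUERY, toy_quality)))
```

A query of n terms has 2^n - 1 non-empty sub-queries (127 for the seven-term example), so exhaustive enumeration as sketched here is only practical for fairly short queries.
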
Year | DOI | Venue |
---|---|---|
2009 | 10.1145/1571941.1572038 | SIGIR |
Keywords | Field | DocType |
average improvement,query reduction,mean average precision,trec description query,original long query,query quality,original query,query quality predictor,extraneous term,long query,learning to rank | Query optimization,Web search query,Data mining,Range query (database),Query expansion,Information retrieval,Computer science,Sargable,Web query classification,Ranking (information retrieval),Boolean conjunctive query | Conference |
Citations | PageRank | References |
111 | 3.29 | 26 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Giridhar Kumaran | 1 | 478 | 22.23 |
Vitor R. Carvalho | 2 | 672 | 36.38 |