Abstract |
---|
Implicit query systems examine a document and automatically conduct searches for the most relevant information. In this paper, we offer three contributions to implicit query research. First, we show how to use query logs from a search engine: constraining results to commonly issued queries yields dramatic improvements. Second, we describe a method for optimizing the parameters of an implicit query system using logistic regression training. The method is designed to estimate the probability that any particular suggested query is a good one. Third, we show which features beyond standard TF-IDF features are most helpful in our logistic regression model: query frequency information, capitalization information, subject line information, and message length information. Using the optimization method and the additional features, we are able to produce a system whose top-1 score is up to 6 times better than that of a simple TF-IDF system. |
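
The abstract's two core ideas, restricting suggestions to queries that appear in a search-engine log and scoring each candidate with logistic regression, can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and not the authors' implementation: the function names (`make_features`, `train`, `suggest`), the feature scaling, the toy training data, and the gradient-descent settings are all hypothetical. Only the feature families (TF-IDF, query frequency, capitalization, subject line, message length) come from the abstract.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def make_features(tfidf, query_count, capitalized, in_subject, msg_len):
    # Hypothetical feature vector; the log scaling is an assumption of this sketch.
    return [
        tfidf,
        math.log1p(query_count) / 10.0,  # query frequency from the log
        float(capitalized),              # phrase is capitalized in the message
        float(in_subject),               # phrase appears in the subject line
        math.log1p(msg_len) / 10.0,      # message length
    ]

def predict(weights, bias, features):
    # Estimated probability that a candidate query is a good suggestion.
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train(examples, labels, lr=0.5, epochs=2000):
    # Fit logistic regression by batch gradient descent on the log loss.
    weights = [0.0] * len(examples[0])
    bias = 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(examples, labels):
            err = predict(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def suggest(candidates, query_log_counts, weights, bias, min_count=1):
    # Keep only candidates seen in the query log, then rank by probability.
    scored = []
    for phrase, tfidf, capitalized, in_subject, msg_len in candidates:
        count = query_log_counts.get(phrase.lower(), 0)
        if count < min_count:
            continue  # not a commonly issued query: never suggested
        feats = make_features(tfidf, count, capitalized, in_subject, msg_len)
        scored.append((predict(weights, bias, feats), phrase))
    return sorted(scored, reverse=True)

# Toy labeled data (invented for this sketch): 1 = good suggestion, 0 = bad.
TRAIN = [
    ((0.9, 5000, True, True, 120), 1),
    ((0.7, 800, True, False, 300), 1),
    ((0.6, 1200, False, True, 80), 1),
    ((0.8, 2, False, False, 200), 0),
    ((0.3, 10, False, False, 500), 0),
    ((0.5, 1, True, False, 150), 0),
]
weights, bias = train([make_features(*x) for x, _ in TRAIN],
                      [y for _, y in TRAIN])

# Hypothetical query log and candidate phrases extracted from one message.
query_log = {"seattle weather": 3000, "quarterly": 3}
candidates = [
    ("Seattle Weather", 0.8, True, True, 100),
    ("quarterly", 0.8, False, False, 100),
    ("xyzzy plugh", 0.9, True, True, 100),  # absent from the log, so dropped
]
ranked = suggest(candidates, query_log, weights, bias)
for prob, phrase in ranked:
    print(f"{phrase}: {prob:.3f}")
```

Because the model outputs a probability and not a raw TF-IDF score, a system built this way could also decline to suggest anything when the top candidate's estimated probability is low; the threshold for doing so would be a design choice outside this sketch.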
Year | Venue | Keywords |
---|---|---|
2005 | CEAS | logistic regression, logistic regression model, search engine |
Field | DocType | Citations |
---|---|---|
Data mining, Search engine, Computer science, Logistic model tree, Message length, Logistic regression | Conference | 15 |
PageRank | References | Authors |
---|---|---|
1.44 | 8 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Joshua Goodman | 1 | 1079 | 146.02 |
Vitor R. Carvalho | 2 | 672 | 36.38 |