Title
An algorithmic treatment of strong queries
Abstract
A strong query for a target document with respect to an index is the smallest query for which the target document is returned by the index as the top result for the query. The strong query problem was first studied more than a decade ago in the context of measuring search engine overlap. Despite its simple-to-state nature and its longevity in the field, this problem has not been sufficiently addressed in a formal manner. In this paper we provide the first rigorous treatment of the strong query problem. We show an interesting connection between this problem and the set cover problem, and use it to obtain basic hardness and algorithmic results. Experiments on more than 10K documents show that our proposed algorithm performs much better than the widely-used word frequency-based heuristic. En route, our study suggests that less than four words on average can be sufficient to uniquely identify web pages.
Year
DOI
Venue
2011
10.1145/1935826.1935928
WSDM
Keywords
Field
DocType
algorithmic result,strong query,frequency-based heuristic,target document,formal manner,basic hardness,smallest query,algorithmic treatment,set cover problem,interesting connection,strong query problem,indexation,greedy algorithm,set covering problem,search engine,web pages,set cover,word frequency
Query optimization,Data mining,Web search query,Range query (database),Query language,Information retrieval,Query expansion,Computer science,Sargable,Web query classification,Theoretical computer science,Ranking (information retrieval)
Conference
Citations 
PageRank 
References 
0
0.34
17
Authors
3
Name
Order
Citations
PageRank
Ravi Kumar1139321642.48
Silvio Lattanzi272046.77
Prabhakar Raghavan3133512776.61