Stochastic Query Covering for Fast Approximate Document Retrieval - Citegraph

Paper Info

Title
Stochastic Query Covering for Fast Approximate Document Retrieval

Abstract
We design algorithms that, given a collection of documents and a distribution over user queries, return a small subset of the document collection in such a way that we can efficiently provide high-quality answers to user queries using only the selected subset. This approach has applications when space is a constraint or when the query-processing time increases significantly with the size of the collection. We study our algorithms through the lens of stochastic analysis and prove that even though they use only a small fraction of the entire collection, they can provide answers to most user queries, achieving a performance close to the optimal. To complement our theoretical findings, we experimentally show the versatility of our approach by considering two important cases in the context of Web search. In the first case, we favor the retrieval of documents that are relevant to the query, whereas in the second case we aim for document diversification. Both the theoretical and the experimental analysis provide strong evidence of the potential value of query covering in diverse application scenarios.

Year	DOI	Venue
2015	10.1145/2699671	ACM Trans. Inf. Syst.
Keywords	Field	DocType
algorithms,query covering,measurement,search process,caching,stochastic analysis	Data mining,Query expansion,Information retrieval,Computer science,Web query classification,Stochastic process,Ranking (information retrieval),Through-the-lens metering,Diversification (marketing strategy),Document retrieval	Journal
Volume	Issue	ISSN
33	3	1046-8188
Citations	PageRank	References
5	0.41	47
Authors
6

Authors (6 rows)

Cited by (5 rows)

References (47 rows)

Name	Order	Citations	PageRank
Aris Anagnostopoulos	1	1054	67.08
Luca Becchetti	2	945	55.75
Ilaria Bordino	3	629	28.81
Stefano Leonardi	4	411	28.87
Ida Mele	5	73	10.95
Piotr Sankowski	6	596	47.95

1