Title
On Using Query Logs for Static Index Pruning
Abstract
Static index pruning techniques aim to remove from the posting lists of an inverted file the references to documents that are unlikely to be relevant for answering user queries. The reduction in index size results in better exploitation of memory hierarchies and faster query processing. On the other hand, pruning may affect the precision of the information retrieval system, since pruned entries are unavailable at query processing time. Static pruning techniques proposed so far exploit query-independent measures to evaluate the importance of a document within a posting list. This paper proposes a general framework that aims to enhance the precision of any static pruning method by exploiting usage information extracted from query logs. Experiments conducted on the TREC WT10g Web collection and a large Altavista query log show that integrating usage knowledge into the pruning process is profitable and remarkably improves the performance figures obtained with Carmel's state-of-the-art static pruning method.
Year: 2010
DOI: 10.1109/WI-IAT.2010.139
Venue: Web Intelligence
Keywords: query log, information retrieval system, user query, static index pruning technique, information retrieval, inverted index, static index pruning, inverted file, indexing, static pruning method, large altavista query log, query-independent measures, memory hierarchies, trec wt10g web collection, information extraction, static pruning, query answering, index result, static pruning technique, query logs, altavista query log, pruning process, query processing time, query processing, indexation, profitability
Field: Inverted index, Data mining, Query expansion, Information retrieval, Computer science, Search engine indexing, Exploit, Information extraction, Pruning
DocType: Conference
Volume: 1
ISBN: 978-0-7695-4191-4
Citations: 2
PageRank: 0.36
References: 7
Authors: 3
Name                 Order   Citations   PageRank
Hoang Thanh Lam      1       108         8.49
Raffaele Perego      2       1471        108.91
Fabrizio Silvestri   3       1819        107.29