Abstract | ||
---|---|---|
This article discusses a novel approach developed for static index pruning that takes into account the locality of occurrences of words in the text. We use this new approach to propose and experiment on simple and effective pruning methods that allow a fast construction of the pruned index. The methods proposed here are especially useful for pruning in environments where the document database changes continuously, such as large-scale web search engines. Extensive experiments are presented showing that the proposed methods can achieve high compression rates while maintaining the quality of results for the most common query types present in modern search engines, namely, conjunctive and phrase queries. In the experiments, our locality-based pruning approach allowed reducing search engine indices to 30% of their original size, with almost no reduction in precision at the top answers. Furthermore, we conclude that even an extremely simple locality-based pruning method can be competitive when compared to complex methods that do not rely on locality information. |
Year | DOI | Venue |
---|---|---|
2008 | 10.1145/1344411.1344415 | ACM Trans. Inf. Syst. |
Keywords | Field | DocType |
large-scale web search engine,locality-based pruning method,modern search engine,locality information,locality-based pruning approach,indexing,simple locality-based pruning method,new approach,search engines,static index pruning,novel approach,information retrieval,effective pruning method,web search,search engine index,pruning,search engine,indexation,web search engine | Data mining,Web search query,Locality,Search engine,Information retrieval,Phrase search,Computer science,Phrase,Search engine indexing,Search analytics,Pruning | Journal |
Volume | Issue | ISSN |
26 | 2 | 1046-8188 |
Citations | PageRank | References |
14 | 0.67 | 28 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Edleno Silva de Moura | 1 | 988 | 75.44 |
Celia Francisca dos Santos | 2 | 14 | 0.67 |
Bruno dos Santos de Araujo | 3 | 14 | 0.67 |
Altigran Soares da Silva | 4 | 718 | 65.15 |
Pável Calado | 5 | 809 | 55.33 |
Mario A. Nascimento | 6 | 1547 | 162.96 |