Abstract |
---|
In this paper, we propose a new merge-based index maintenance strategy for Information Retrieval systems. The new model is based on partitioning the inverted index across its terms. We exploit the query log to partition the on-disk inverted index into two types of sub-indexes. Inverted lists of the terms contained in queries that are frequently posed to the Information Retrieval system are kept in one partition, called the frequent-term index, and the remaining inverted lists form another partition, called the infrequent-term index. We use a lazy-merge strategy for maintaining infrequent-term sub-indexes, and an active-merge strategy for maintaining frequent-term sub-indexes. The sub-indexes are also similarly split into frequent and infrequent parts. Experimental results show that the proposed method improves both index maintenance performance and query performance compared to existing merge-based strategies. |
Year | DOI | Venue |
---|---|---|
2009 | 10.1145/1645953.1646010 | CIKM |
Keywords | Field | DocType |
on-line index maintenance,frequent-term index,frequent-term sub-indexes,inverted list,on-disk inverted index,new merge-based index maintenance,information retrieval system,infrequent-term sub-indexes,index maintenance performance,infrequent-term index,horizontal partitioning,inverted index,indexation,inverted file,search engine | Inverted index,Data mining,Maintenance strategy,Search engine,Information retrieval,Computer science,Exploit,Merge (version control),Partition (number theory) | Conference |
Citations | PageRank | References |
10 | 0.52 | 14 |
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sairam Gurajada | 1 | 118 | 7.83 |
P. Sreenivasa Kumar | 2 | 227 | 29.64 |
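
The partitioning idea in the abstract can be illustrated with a small in-memory sketch. This is not the authors' implementation: the names `frequent_terms`, `PartitionedIndex`, `min_count`, and `lazy_limit` are hypothetical, and the two merge rules (merge frequent-term sub-indexes on every flush; merge infrequent-term sub-indexes only once more than `lazy_limit` of them have accumulated) are simplifying assumptions standing in for the paper's active-merge and lazy-merge strategies.

```python
from collections import Counter, defaultdict


def frequent_terms(query_log, min_count=2):
    """Terms occurring at least min_count times in the query log (assumed rule)."""
    counts = Counter(term for query in query_log for term in query.split())
    return {term for term, count in counts.items() if count >= min_count}


def merge(runs):
    """Merge several sub-indexes (term -> doc ids) into a single sub-index."""
    merged = defaultdict(list)
    for run in runs:
        for term, postings in run.items():
            merged[term].extend(postings)
    return {term: sorted(set(postings)) for term, postings in merged.items()}


class PartitionedIndex:
    def __init__(self, frequent, lazy_limit=3):
        self.frequent = frequent      # terms routed to the frequent-term partition
        self.lazy_limit = lazy_limit  # hypothetical threshold for the lazy merge
        self.freq_runs = []           # frequent-term sub-indexes
        self.infreq_runs = []         # infrequent-term sub-indexes

    def flush(self, in_memory):
        """Write an in-memory index (term -> doc ids) out to both partitions."""
        freq = {t: p for t, p in in_memory.items() if t in self.frequent}
        infreq = {t: p for t, p in in_memory.items() if t not in self.frequent}
        if freq:
            # Active merge (assumed): always collapse into one sub-index.
            self.freq_runs = [merge(self.freq_runs + [freq])]
        if infreq:
            # Lazy merge (assumed): let sub-indexes pile up before merging.
            self.infreq_runs.append(infreq)
            if len(self.infreq_runs) > self.lazy_limit:
                self.infreq_runs = [merge(self.infreq_runs)]

    def lookup(self, term):
        """Collect postings for a term from the partition that holds it."""
        runs = self.freq_runs if term in self.frequent else self.infreq_runs
        return sorted({doc for run in runs for doc in run.get(term, [])})
```

Under these assumptions, a lookup for a frequent term reads a single sub-index, while infrequent terms may touch several sub-indexes but cost less merge work per flush, which is the trade-off the abstract describes.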