Title
High Performance Index Build Algorithms for Intranet Search Engines
Abstract
There has been a substantial amount of research on high-performance algorithms for constructing an inverted text index. However, constructing the inverted index in an intranet search engine is only the final step in a more complicated index build process. Among other things, this process requires an analysis of all the data being indexed to compute measures like PageRank. The time to perform this global analysis step is significant compared to the time to construct the inverted index, yet it has not received much attention in the research literature. In this paper, we describe how the use of slightly outdated information from global analysis and a fast index construction algorithm based on radix sorting can be combined in a novel way to significantly speed up the index build process without sacrificing search quality.
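The abstract does not describe the authors' algorithm in detail, so the following is only a generic sketch of how an inverted index can be built by radix sorting, not the paper's method. It assumes postings are (term ID, document ID) pairs packed into one integer key, with document IDs that are non-negative and below 2^32. The function names `radix_sort` and `build_inverted_index` and the in-memory bucket layout are illustrative assumptions.

```python
def radix_sort(keys, key_bits=64, radix_bits=8):
    """LSD radix sort of non-negative integers smaller than 2**key_bits."""
    mask = (1 << radix_bits) - 1
    for shift in range(0, key_bits, radix_bits):
        # One stable bucketing pass per digit, least significant digit first.
        buckets = [[] for _ in range(1 << radix_bits)]
        for key in keys:
            buckets[(key >> shift) & mask].append(key)
        keys = [key for bucket in buckets for key in bucket]
    return keys


def build_inverted_index(docs):
    """Build {term: sorted list of doc IDs} from {doc_id: list of tokens}.

    Assumption: every doc_id is an integer in [0, 2**32).
    """
    term_ids = {}
    keys = []
    for doc_id, tokens in docs.items():
        for token in tokens:
            # Assign dense term IDs in order of first appearance.
            term_id = term_ids.setdefault(token, len(term_ids))
            # Pack (term_id, doc_id) so that sorting the integer
            # sorts by term first and by document second.
            keys.append((term_id << 32) | doc_id)

    terms = {term_id: token for token, term_id in term_ids.items()}
    index = {}
    for key in radix_sort(keys):
        term_id, doc_id = key >> 32, key & 0xFFFFFFFF
        postings = index.setdefault(terms[term_id], [])
        # Sorted order puts duplicates next to each other, so one check suffices.
        if not postings or postings[-1] != doc_id:
            postings.append(doc_id)
    return index


docs = {
    1: ["intranet", "search"],
    2: ["search", "index"],
    3: ["index", "search", "search"],
}
index = build_inverted_index(docs)
```

Packing both fields into a single fixed-width key is what makes radix sorting applicable here: the sort needs no comparisons, and its cost grows linearly with the number of postings for a fixed key width.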
Year: 2004
DOI: 10.1016/B978-012088469-8.50101-7
Venue: VLDB
Keywords: final step, high performance index, intranet search engine, search quality, global analysis step, global analysis, fast index construction algorithm, research literature, complicated index, inverted text index, inverted index, indexation, search engine
Field: Inverted index, Data mining, PageRank, Search engine, Performance index, Computer science, Intranet, Radix sort, Algorithm, Database, Speedup
DocType: Conference
ISSN: Proceedings 2004 VLDB Conference
ISBN: 0-12-088469-0
Citations: 13
PageRank: 1.13
References: 19
Authors: 5
Name
Order
Citations
PageRank
Marcus Fontoura1111661.74
Eugene J. Shekita23630574.21
Jason Y. Zien358966.01
Sridhar Rajagopalan445271036.34
Andreas Neumann5131.13