Title
Comparing inverted files and signature files for searching a large lexicon
Abstract
Signature files and inverted files are well-known index structures. In this paper we undertake a direct comparison of the two for answering partially specified queries over a large lexicon stored in main memory. Using n-grams to index lexicon terms, a bit-sliced signature file can be compressed to a smaller size than an inverted file if each n-gram sets only one bit in the term signature. With a signature width less than half the number of unique n-grams in the lexicon, the signature file method is about as fast as the inverted file method and significantly smaller. Greater flexibility in memory usage and faster index generation make signature files appropriate for searching large lexicons or other collections in environments where memory is at a premium.
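The idea in the abstract can be illustrated with a minimal sketch of a bit-sliced signature file in which each n-gram sets exactly one bit of a term's signature. This is not the authors' implementation: the class name `BitSlicedSignatureFile`, the helpers `ngrams` and `bit_for`, the use of CRC32 as the hash, the bigram default, the signature width of 64, and the restriction of queries to the `*` wildcard are all assumptions made for illustration, and compression of the slices is omitted.

```python
import zlib
from fnmatch import fnmatchcase


def ngrams(text, n=2):
    """Return the set of character n-grams in text (empty if text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def bit_for(gram, width):
    """Map an n-gram to a single signature bit (assumed hash: CRC32 modulo width)."""
    return zlib.crc32(gram.encode("utf-8")) % width


class BitSlicedSignatureFile:
    """Hypothetical in-memory index: one bit slice per signature position."""

    def __init__(self, terms, width=64, n=2):
        self.terms = list(terms)
        self.width = width
        self.n = n
        # slices[b] is an integer bitmap; bit t is set when term t sets signature bit b.
        self.slices = [0] * width
        for t, term in enumerate(self.terms):
            for gram in ngrams(term, n):
                self.slices[bit_for(gram, width)] |= 1 << t

    def search(self, pattern):
        """Return terms matching a pattern whose only wildcard is '*'."""
        # Start with every term as a candidate, then AND the slice of each
        # n-gram that the specified parts of the query require.
        candidates = (1 << len(self.terms)) - 1
        for piece in pattern.split("*"):
            for gram in ngrams(piece, self.n):
                candidates &= self.slices[bit_for(gram, self.width)]
        # Different n-grams can share a bit, so candidates may include false
        # drops; verify each surviving term against the full pattern.
        return [term for t, term in enumerate(self.terms)
                if (candidates >> t) & 1 and fnmatchcase(term, pattern)]
```

For example, building the index over the terms `lexicon`, `lexical`, `index`, `signature`, and `inverted` and calling `search("lex*")` narrows the candidates by ANDing the slices for the bigrams `le` and `ex`, then verifies the survivors. A smaller `width` saves memory at the cost of more false drops to filter out in the verification step, which is the trade-off the abstract describes.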
Year: 2005
DOI: 10.1016/j.ipm.2003.12.003
Venue: Inf. Process. Manage.
Keywords: dictionaries, signature file, signature width, performance evaluation, term signature, signature file method, personal digital assistants (PDAs), compression, index lexicon term, indexing methods, inverted file method, large lexicon, faster index generation time, inverted file, bit-sliced signature file, indexation, generation time
Field: Inverted index, Data mining, Indexation, Information retrieval, Computer science, Search engine indexing, Lexicon, Lexico, Signature file, Statistical analysis
DocType: Journal
Volume
Issue
ISSN
41
3
Information Processing and Management
Citations: 4
PageRank: 0.37
References: 36
Authors
2
Name
Order
Citations
PageRank
Ben Carterette 1 154483.86
Fazli Can 2 58194.63