Title
Succinct Text Indexing with Wildcards
Abstract
A succinct text index uses space proportional to the text itself, say, two times n log*** for a text of n characters over an alphabet of size *** . In the past few years, there were several exciting results leading to succinct indexes that support efficient pattern matching. In this paper we present the first succinct index for a text that contains wildcards. The space complexity of our index is (3 + o (1))n log*** + O (***logn ) bits, where *** is the number of wildcard groups in the text. Such an index finds applications in indexing genomic sequences that contain single-nucleotide polymorphisms (SNP), which could be modeled as wildcards. In the course of deriving the above result, we also obtain an alternate succinct index of a set of d patterns for the purpose of dictionary matching. When compared with the succinct index in the literature, the new index doubles the size (precisely, from n log*** to 2 n log*** , where n is the total length of all patterns), yet it reduces the matching time to O (m log*** + m logd + occ ), where m is the length of the query text. It is worth-mentioning that the time complexity no longer depends on the total dictionary size.
Year
DOI
Venue
2009
10.1007/978-3-642-03784-9_5
SPIRE
Keywords
Field
DocType
new index,succinct index,m log,n character,n log,dictionary matching,succinct text index,succinct text indexing,times n log,query text,alternate succinct index,genome sequence,indexation,time complexity,single nucleotide polymorphism,pattern matching,space complexity
Wildcard,Combinatorics,Wildcard character,Code segment,Search engine indexing,Time complexity,Pattern matching,Mathematics,Alphabet
Conference
Volume
ISSN
Citations 
5721
0302-9743
16
PageRank 
References 
Authors
0.68
15
4
Name
Order
Citations
PageRank
Alan Tam1602.71
Edward Wu22208.81
Tak-Wah Lam31860164.96
Siu-ming Yiu4102692.90