Abstract | ||
---|---|---|
To store and search genomic databases efficiently, researchers have recently started building compressed self-indexes based on straight-line programs and LZ77. In this paper we show how, given a balanced straight-line program for a string S[1..n] whose LZ77 parse consists of z phrases, we can add O(z log log z) words and obtain a compressed self-index for S such that, given a pattern P [1..m], we can list the occ occurrences of P in S in O(m2 + (m + occ) log log n) time. All previous self-indexes are either larger or slower in the worst case. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1007/978-3-642-28332-1_21 | LATA |
Keywords | Field | DocType |
log log n,grammar-based self-index,straight-line program,balanced straight-line program,genomic databases,occ occurrence,worst case,previous self-indexes,lz77 parse,log log,pattern p,data structure,indexation | Log-log plot,Discrete mathematics,Computer science,Grammar,Parsing | Conference |
Citations | PageRank | References |
43 | 1.22 | 25 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Travis Gagie | 1 | 643 | 63.61 |
Paweł Gawrychowski | 2 | 116 | 6.33 |
Juha Kärkkäinen | 3 | 1354 | 95.20 |
Yakov Nekrich | 4 | 450 | 38.94 |