Abstract | ||
---|---|---|
Consider an ordered, static tree T where each node has a label from alphabet Σ. Tree T may be of arbitrary degree and shape. Our goal is designing a compressed storage scheme of T that supports basic navigational operations among the immediate neighbors of a node (i.e. parent, ith child, or any child with some label,…) as well as more sophisticated path-based search operations over its labeled structure. We present a novel approach to this problem by designing what we call the XBW-transform of the tree in the spirit of the well-known Burrows-Wheeler transform for strings [1994]. The XBW-transform uses path-sorting to linearize the labeled tree T into two coordinated arrays, one capturing the structure and the other the labels. For the first time, by using the properties of the XBW-transform, our compressed indexes go beyond the information-theoretic lower bound, and support navigational and path-search operations over labeled trees within (near-)optimal time bounds and entropy-bounded space. Our XBW-transform is simple and likely to spur new results in the theory of tree compression and indexing, as well as interesting application contexts. As an example, we use the XBW-transform to design and implement a compressed index for XML documents whose compression ratio is significantly better than the one achievable by state-of-the-art tools, and its query time performance is order of magnitudes faster. |
Year | DOI | Venue |
---|---|---|
2009 | 10.1145/1613676.1613680 | J. ACM |
Keywords | DocType | Volume |
optimal time bound,path searching,support navigational,xml indexing,tree compression,static tree,ith child,tree navigation,burrows-wheeler transform,labeled tree compression,compression ratio,query time performance,labeled tree indexing,basic navigational operation,xml compression,XML document,arbitrary degree | Journal | 57 |
Issue | ISSN | Citations |
1 | 0004-5411 | 62 |
PageRank | References | Authors |
1.71 | 50 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Paolo Ferragina | 1 | 2220 | 130.64 |
Fabrizio Luccio | 2 | 593 | 87.43 |
Giovanni Manzini | 3 | 1584 | 111.42 |
S. Muthukrishnan | 4 | 8025 | 734.98 |