Abstract | ||
---|---|---|
Dynamic dictionary-based compression schemes are the most daily used data compression schemes since they appeared in the foundational paper of Ziv and Lempel in 1977, commonly referred to as LZ77. In dynamic setting, LZ77 considers a portion of the previous text as a dictionary and it uses a greedy approach to select, at each step, the longest match between the text and the dictionary. Compression is achieved by replacing matches with encoded dictionary pointers. LZ77 is the base of gzip, zip, rar, 7zip and many others compression software. All these compression schemes use variants of the greedy approach to parse (or factorise) the text into dictionary phrases. Greedy parsing optimality with respect to the number of phrases was proved by Storer et al. (1982) for unbounded LZ77-based dictionaries and by Cohn et al. (1996) for static suffix-closed dictionaries. The optimality of the greedy parsing was never proved for bounded size dictionary which is actually required by all of these practical schemes. In this article, we define the suffix-closed property for dynamic dictionaries, and we show that LZ77-based compression schemes, including the bounded dictionary variants, satisfy this property. Under this condition we prove the optimality of the greedy parsing as a variant of the proof by Cohn et al. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1016/j.tcs.2014.01.013 | Theor. Comput. Sci. |
Keywords | DocType | Volume |
greedy approach,dictionary-based text compression,LZ77-based compression scheme,bounded size dictionary,compression software,bounded dictionary variant,greedy parsing,LZ77-based dictionary,dictionary phrase,greedy parsing optimality,data compression scheme,compression scheme | Journal | 525, |
ISSN | Citations | PageRank |
0304-3975 | 6 | 0.51 |
References | Authors | |
14 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Maxime Crochemore | 1 | 2655 | 281.75 |
Alessio Langiu | 2 | 46 | 6.52 |
Filippo Mignosi | 3 | 569 | 99.71 |