Abstract | ||
---|---|---|
A practical suboptimal (variable source coding) algorithm for lossy data compression is presented. This scheme is based on approximate string matching, and it naturally extends the lossless Lempel-Ziv (1977) data compression scheme. Among others we consider the typical length of an approximately repeated pattern within the first n positions of a stationary mixing sequence where D percent of mismatches is allowed. We prove that there exists a constant r0(D) such that the length of such an approximately repeated pattern converges in probability to 1/r0(D) log n (pr.) but it almost surely oscillates between 1/r-∞(D) log n and 2/r1(D) log n, where r -∞(D)>r0(D)>r1(D)/2 are some constants. These constants are natural generalizations of Renyi entropies to the lossy environment. More importantly, we show that the compression ratio of a lossy data compression scheme based on such an approximate pattern matching is asymptotically equal to r0(D). We also establish the asymptotic behavior of the so-called approximate waiting time Nl which is defined as the time until a pattern of length C repeats approximately for the first time. We prove that log Nl/l→r0(D) (pr.) as l→∞. In general, r0(D)>R(D) where R(D) is the rate distortion function. Thus for stationary mixing sequences we settle in the negative the problem investigated by Steinberg and Gutman by showing that a lossy extension of the Wyner-Ziv (1989) scheme cannot be optimal |
Year | DOI | Venue |
---|---|---|
1997 | 10.1109/18.623143 | IEEE Transactions on Information Theory |
Keywords | Field | DocType |
approximation theory,data compression,encoding,entropy,pattern matching,probability,rate distortion theory,sequences,string matching,Renyi entropies,approximate pattern matching,approximate string matching,approximate waiting time,asymptotic behavior,compression ratio,lossless Lempel-Ziv data compression,lossy environment,probability,rate distortion function,stationary mixing sequence,stationary mixing sequences,suboptimal algorithm,suboptimal lossy data compression,variable source coding algorithm | Binary logarithm,Discrete mathematics,Combinatorics,Lossy compression,Approximate string matching,Almost surely,Data compression,Rate–distortion theory,Asymptotic analysis,Mathematics,Lossless compression | Journal |
Volume | Issue | ISSN |
43 | 5 | 0018-9448 |
Citations | PageRank | References |
52 | 4.27 | 23 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Luczak, T. | 1 | 52 | 4.27 |
W. Szpankowski | 2 | 381 | 31.59 |
Tomasz Łuczak | 3 | 225 | 40.26 |