Title
Deterministic Document Exchange Protocols, and Almost Optimal Binary Codes for Edit Errors
Abstract
We study two basic problems regarding edit errors (insertions and deletions). The first one is document exchange, where two parties Alice and Bob hold two strings x and y with a bounded edit distance k. The goal is to have Alice send a short sketch to Bob, so that Bob can recover x based on y and the sketch. The second one is the fundamental problem of designing error correcting codes for edit errors, where the goal is to construct an explicit code to transmit a message x through a channel that can add at most k worst case insertions and deletions, so that the original message x can be successfully recovered at the other end of the channel. Both problems have been extensively studied for decades, and in this paper we focus on deterministic document exchange protocols and binary codes for insertions and deletions (insdel codes). If the length of x is n, then it is known that for small k (e.g., k ≤ n/4), in both problems the optimal sketch size or the optimal number of redundant bits is Θ(k log n/k). In particular, this implies the existence of binary codes that can correct ε fraction of insertions and deletions with rate 1-Θ(ε log (1/ε). However, known constructions are far from achieving these bounds. In this paper we significantly improve previous results on both problems. For document exchange, we give an efficient deterministic protocol with sketch size O(k log <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> n/k). This significantly improves the previous best known deterministic protocol, which has sketch size O(k <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> + k log <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> n) [2]. For binary insdel codes, we obtain the following results: 1) An explicit binary insdel code which encodes an n-bit message x against k errors with redundancy O(k log <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> n/k). In particular this implies an explicit family of binary insdel codes that can correct ε fraction of insertions and deletions with rate 1-O(ε log <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> 1/(1-ε))=1-Õ(ε). This significantly improves the previous best known result which only achieves rate 1-Õ(√ε) [11], [10], and is optimal up to a log (1/ε) factor. 1) An explicit binary insdel code which encodes an n-bit message x against k errors with redundancy O(k log n). This significantly improves the previous best known result of [4], which only works for constant k and has redundancy O(k <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> log k log n); and that of [2], which has redundancy O(k <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> + k log <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> n). Our code has optimal redundancy for k ≤ n <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1-α</sup> , any constant 0 <; α <; 1. This is the first explicit construction of binary insdel codes that has optimal redundancy for a wide range of error parameters k, and this brings our understanding of binary insdel codes much closer to that of standard binary error correcting codes. In obtaining our results we introduce several new techniques. Most notably, we introduce the notion of ε-self matching hash functions and ε-synchronization hash functions. We believe our techniques can have further applications in the literature.
Year
DOI
Venue
2018
10.1109/FOCS.2018.00028
2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS)
Keywords
DocType
Volume
Edit errors, Document exchange, Error correcting codes
Conference
abs/1804.05776
ISSN
ISBN
Citations 
1523-8288
978-1-5386-4231-3
3
PageRank 
References 
Authors
0.38
11
4
Name
Order
Citations
PageRank
Kuan Cheng174.86
Zhengzhong Jin243.78
Xin Li341.74
Ke Wu4978.63