Abstract | ||
---|---|---|
Edit distance has been extensively studied for the past several years. Nevertheless, no linear-time algorithm is known to compute the edit distance between two strings, or even to approximate it to within a modest factor. Furthermore, for various natural algorithmic problems such as low-distortion embeddings into normed spaces, approximate nearest-neighbor schemes, and sketching algorithms, known results for the edit distance are rather weak. We develop algorithms that solve gap versions of the edit distance problem: given two strings of length n with the promise that their edit distance is either at most k or greater than \ell, decide which of the two holds. We present two sketching algorithms for gap versions of edit distance. Our first algorithm solves the k vs.(kn)^{{2 \mathord{\left/ {\vphantom {2 3}} \right. \kern-\nulldelimiterspace} 3}} gap problem, using a constant size sketch. A more involved algorithm solves the stronger k vs. \ell gap problem, where \ell can be as small as O(k虏) 驴 still with a constant sketch 驴 but works only for strings that are mildly "non-repetitive". Finally, we develop an n^{{3 \mathord{\left/ {\vphantom {3 7}} \right. \kern-\nulldelimiterspace} 7}}-approximation quasi-linear time algorithm for edit distance, improving the previous best factor of n^{{3 \mathord{\left/ {\vphantom {3 4}} \right. \kern-\nulldelimiterspace} 4}} [5]; if the input strings are assumed to be non-repetitive, then the approximation factor can be strengthened to n^{{1 \mathord{\left/ {\vphantom {1 3}} \right. \kern-\nulldelimiterspace} 3}}. |
Year | DOI | Venue |
---|---|---|
2004 | 10.1109/FOCS.2004.14 | FOCS |
Keywords | Field | DocType |
approximation theory,computational complexity,string matching,approximate nearest-neighbor schemes,approximation factor,approximation quasilinear time algorithm,edit distance approximation,gap versions,low-distortion embeddings,natural algorithmic problems,nonrepetitive strings,sketching algorithms | String-to-string correction problem,Edit distance,String searching algorithm,Discrete mathematics,Combinatorics,Wagner–Fischer algorithm,Approximation theory,Jaro–Winkler distance,Damerau–Levenshtein distance,Mathematics,Computational complexity theory | Conference |
ISSN | ISBN | Citations |
0272-5428 | 0-7695-2228-9 | 48 |
PageRank | References | Authors |
1.99 | 19 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ziv Bar-Yossef | 1 | 1776 | 118.00 |
Jayram, T.S. | 2 | 61 | 3.47 |
Robert Krauthgamer | 3 | 1417 | 103.80 |
Ravi Kumar | 4 | 13932 | 1642.48 |