Title
On approximate pattern matching with thresholds.
Abstract
In the traditional version of the problem of approximate pattern matching, a pattern symbol is considered to match a text symbol if the two symbols are equal. Such a notion of exact equality is not suitable for situations where the text and pattern symbols are imprecise, e.g., obtained from an analog source, distorted by additive noise, etc. In such situations it is more appropriate to consider two alphabet symbols to match even if they are not equal, as long as they do not differ by more than a given threshold θ. The goal is then to compute the number of matches of the length-M pattern with all length-M substrings of the length-N text, i.e., to compute a vector of N−M+1 scores, where the ith score is the number of matches between the pattern and the substring that begins at text position i. The main result of this paper is to show that this threshold version of the problem can be solved by recursively solving 3+2log⁡θ instances of the traditional (i.e., zero-threshold) version of the problem, which is much-studied in the literature and for which there are many efficient (typically randomized) solutions of time complexity close to O(Nlog⁡M). This paper's result therefore implies the first randomized O(Nlog⁡M(log⁡θ+1)) solution for the threshold version of the problem. It also implies that any future improvement to the traditional (zero-threshold) version of the problem automatically translates into a similar improvement to the arbitrary-threshold case. Furthermore, we show that the factor Ω(log⁡θ) is tight if use our recursive framework.
Year
DOI
Venue
2017
10.1016/j.ipl.2017.03.001
Information Processing Letters
Keywords
Field
DocType
Design of algorithms,Pattern matching,Threshold,Recursion
Discrete mathematics,Substring,Combinatorics,Symbol,Time complexity,Pattern matching,Mathematics,Recursion,Alphabet
Journal
Volume
ISSN
Citations 
123
0020-0190
2
PageRank 
References 
Authors
0.35
8
2
Name
Order
Citations
PageRank
Peng Zhang150.74
Mikhail J. Atallah23828340.54