Title
Mismatch sampling
Abstract
We reconsider the well-known problem of pattern matching under the Hamming distance. Previous approaches have shown how to count the number of mismatches efficiently, especially when a bound is known for the maximum Hamming distance. Our interest is different in that we wish to collect a random sample of mismatches of fixed size at each position in the text. Given a pattern p of length m and a text t of length n, we show how to sample with high probability up to c mismatches from every alignment of p and t in O((c + log n)(n + m log m) log m) time. Further, we guarantee that the mismatches are sampled uniformly and can therefore be seen as representative of the types of mismatches that occur.
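To make the problem statement concrete, here is a minimal brute-force sketch of what "sampling up to c mismatches at every alignment" means. It is not the algorithm from the paper: it runs in O(nm) time and serves only as a reference baseline. The function name `sample_mismatches` and the `rng` parameter are hypothetical, introduced for illustration.

```python
import random


def sample_mismatches(p, t, c, rng=None):
    """Naive baseline (assumption: NOT the paper's method).

    For every alignment i of pattern p against text t, return up to c
    mismatch offsets j (where p[j] != t[i + j]), chosen uniformly at
    random without replacement.
    """
    rng = rng or random.Random()
    m, n = len(p), len(t)
    samples = []
    for i in range(n - m + 1):
        # Collect every pattern offset that disagrees with the text here.
        mismatches = [j for j in range(m) if p[j] != t[i + j]]
        if len(mismatches) > c:
            # Uniform sample of exactly c distinct mismatch offsets.
            mismatches = rng.sample(mismatches, c)
        samples.append(sorted(mismatches))
    return samples


if __name__ == "__main__":
    # One list of sampled offsets per alignment of "abc" in "abcabd".
    print(sample_mismatches("abc", "abcabd", c=1, rng=random.Random(0)))
```

Alignments with at most c mismatches return all of them; only alignments with more than c mismatches are subsampled, which matches the "up to c" wording of the abstract.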
Year: 2012
DOI: 10.1016/j.ic.2012.02.007
Venue: Inf. Comput.
Keywords: high probability c mismatches, Hamming distance, length m, length n, m logm, maximum Hamming distance, pattern p, random sample, fixed size, previous approach, Mismatch Sampling
DocType: Journal
Volume: 214
Citations: 2
PageRank: 0.37
References: 7
Authors: 5
Name              Order  Citations  PageRank
Raphaël Clifford  1      268        28.57
Klim Efremenko    2      135        15.31
Benny Porat       3      64         6.56
Ely Porat         4      1007       79.16
Amir Rothschild   5      49         3.46