Title
Fast search algorithms for position specific scoring matrices
Abstract
Fast search algorithms for finding good instances of patterns given as position specific scoring matrices are developed, and some empirical results on their performance on DNA sequences are reported. The algorithms basically generalize the Aho-Corasick, filtration, and superalphabet techniques of string matching to the scoring matrix search. As compared to the naive search, our algorithms can be faster by a factor which is proportional to the length of the pattern. In our experimental comparison of different algorithms the new algorithms were clearly faster than the naive method and also faster than the well-known lookahead scoring algorithm. The Aho-Corasick technique is the fastest for short patterns and high significance thresholds of the search. For longer patterns the filtration method is better while the superalphabet technique is the best for very long patterns and low significance levels. We also observed that the actual speed of all these algorithms is very sensitive to implementation details.
Year
DOI
Venue
2007
10.1007/978-3-540-71233-6_19
BIRD
Keywords
Field
DocType
superalphabet technique,dna sequence,filtration method,position specific scoring matrix,naive method,aho-corasick technique,high significance threshold,scoring matrix search,low significance level,naive search,aho corasick,string matching,search algorithm
Position-Specific Scoring Matrices,String searching algorithm,Incremental heuristic search,Search algorithm,Pattern recognition,Computer science,Matrix (mathematics),Scoring algorithm,Artificial intelligence,Machine learning
Conference
Volume
ISSN
Citations 
4414
0302-9743
9
PageRank 
References 
Authors
0.66
16
3
Name
Order
Citations
PageRank
Cinzia Pizzi113915.73
Pasi Rastas2577.13
Esko Ukkonen32953468.50