Title
A search for common patterns in many sequences.
Abstract
A new approach to the search for common patterns in many sequences is presented. The idea is that one sequence from the set of sequences to be compared is considered as a 'basic' one and all its similarities with the other sequences are found. Multiple similarities are then reconstructed using these data. This approach allows one to search for similar segments which can differ in both substitutions and deletions/insertions. These segments can be situated at different positions in the various sequences. No regions of complete or strong similarity within the segments are required. The other parts of the sequences can have no similarity at all. The only requirement is that the similar segments can be found in all the sequences (or in the majority of them, provided the common segments are present in the basic sequence). The running time of the algorithm presented is proportional to n·L² when n sequences of length L are analyzed. The proposed algorithm is implemented as programs for the IBM-PC and IBM/370. Its applications to the analysis of biopolymer primary structures, as well as the dependence of the results on the choice of basic sequence, are discussed.
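The 'basic sequence' idea described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it substitutes a plain Smith-Waterman local alignment for the pairwise similarity search, keeps only the single best alignment per pair, and reports the stretches of the basic sequence that are covered by alignments to enough of the other sequences. The function names (best_local_alignment, common_segments), the scoring values, and the min_score and min_support parameters are all assumptions made for illustration. Each pairwise comparison costs on the order of L² steps, so comparing the basic sequence with the remaining n - 1 sequences is consistent with the n·L² running time stated above.

```python
def best_local_alignment(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment (illustrative scoring values).

    Returns (score, a_start, a_end, b_start, b_end) with half-open ranges.
    Substitutions and insertions/deletions are both allowed.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]
    best = (0, 0, 0)  # (score, i, j) of the highest-scoring cell
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best[0]:
                best = (H[i][j], i, j)
    score, i, j = best
    end_i, end_j = i, j
    # Trace back from the best cell until the score drops to zero.
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return score, i, end_i, j, end_j


def common_segments(sequences, basic_index=0, min_score=8, min_support=None):
    """Report segments of the basic sequence shared with other sequences.

    min_support is the number of other sequences that must cover a position
    (hypothetical parameter; defaults to all of them).
    """
    basic = sequences[basic_index]
    others = [s for k, s in enumerate(sequences) if k != basic_index]
    if min_support is None:
        min_support = max(1, len(others))
    coverage = [0] * len(basic)
    for other in others:
        score, start, end, _, _ = best_local_alignment(basic, other)
        if score >= min_score:
            for p in range(start, end):
                coverage[p] += 1
    # Collect maximal runs of positions with enough support.
    segments, run_start = [], None
    for p, c in enumerate(coverage + [0]):  # trailing 0 closes an open run
        if c >= min_support and run_start is None:
            run_start = p
        elif c < min_support and run_start is not None:
            segments.append((run_start, p, basic[run_start:p]))
            run_start = None
    return segments


# Toy strings over an arbitrary alphabet; the flanks are deliberately unrelated.
toy = ["WWWWGATTACAYYYY", "PPGATTACAQQQQQ", "RRRRRRGATACASS"]
print(common_segments(toy))
```

In the toy input the shared segment sits at a different offset in each string, and the third string carries a one-letter deletion relative to the basic one. Lowering min_support corresponds to the abstract's "majority of sequences" case, and changing basic_index gives a way to see how the result depends on the choice of basic sequence.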
Year
1992
DOI
10.1093/bioinformatics/8.1.57
Venue
COMPUTER APPLICATIONS IN THE BIOSCIENCES
Field
Nucleic acid sequence, Computer science, Fortran, Bioinformatics, Nucleic acid, Microcomputer
DocType
Journal
Volume
8
Issue
1
ISSN
0266-7061
Citations
27
PageRank
41.33
References
5
Authors
1
Name
Mikhail A. Roytberg
Order
1
Citations
114
PageRank
54.66