Title | ||
---|---|---|
A novel method of protein sequence classification based on oligopeptide frequency analysis and its application to search for functional sites and to domain localization. |
Abstract | ||
---|---|---|
A new method for distinguishing among protein families based on the analysis of oligopeptide composition of amino acid sequences is presented. It is assumed that any protein family can be characterized by a set of essential oligopeptides (oligopeptide vocabulary). A simple approach to find such a vocabulary is suggested. It is shown that comparison of the vocabularies can distinguish among different families and the latter from random sequences. This comparison can be successfully made with a small set of frequencies of 25 dipeptides (or tripeptides). No preliminary alignment is necessary. It is established that characteristic peptides are located in the regions of functional value, as shown for GTP-binding domains of the translation elongation factors. It is demonstrated that this method is reasonably efficient for localizing functional domains in the amino acid sequences. The average error of prediction does not exceed three or four amino acid residues as shown for several functional domains. |
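
The abstract does not state how the oligopeptide vocabulary is selected or how domains are localized, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' procedure. It assumes that a vocabulary is the set of k-peptides most over-represented in a family relative to a background set (ranked by frequency difference), and that candidate functional regions are windows containing many vocabulary peptides. The function names `kmer_frequencies`, `vocabulary`, and `window_scores`, and the `window` parameter, are hypothetical.

```python
from collections import Counter


def kmer_frequencies(sequences, k=2):
    """Relative frequency of each overlapping k-peptide across the sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def vocabulary(family, background, k=2, size=25):
    """Pick the `size` k-peptides most over-represented in `family`.

    Assumption: over-representation is measured as the difference between
    the family frequency and the background frequency; the paper's actual
    criterion is not given in the abstract.
    """
    fam = kmer_frequencies(family, k)
    bg = kmer_frequencies(background, k)
    ranked = sorted(fam, key=lambda kmer: fam[kmer] - bg.get(kmer, 0.0), reverse=True)
    return set(ranked[:size])


def window_scores(seq, vocab, k=2, window=10):
    """Count vocabulary k-peptides that start inside each sliding window of `seq`."""
    hits = [1 if seq[i:i + k] in vocab else 0 for i in range(len(seq) - k + 1)]
    return [sum(hits[i:i + window]) for i in range(max(len(hits) - window + 1, 0))]
```

No alignment is needed in this sketch because every step works on raw k-peptide counts. High-scoring windows from `window_scores` would be the candidate regions to compare against known functional sites.
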
Year | DOI | Venue |
---|---|---|
1993 | 10.1093/bioinformatics/9.1.17 | Computer Applications in the Biosciences |
Keywords | Field | DocType |
frequency analysis, protein sequence | Sequence alignment, Protein family, Elongation factor, Protein sequencing, Computer science, Peptide, Oligopeptide, Homology (biology), Bioinformatics, Vocabulary | Journal |
Volume | Issue | ISSN |
9 | 1 | 0266-7061 |
Citations | PageRank | References |
13 | 1.46 | 0 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Victor V. Solovyev | 1 | 193 | 35.93 |
Kira S. Makarova | 2 | 57 | 5.84 |