Abstract | ||
---|---|---|
Protein secondary structure prediction methods based on probabilistic models such as the hidden Markov model (HMM) appeal to many because they provide meaningful information about the sequence-structure relationship. However, at present, the prediction accuracy of pure HMM-type methods is much lower than that of machine learning-based methods such as neural networks (NN) or support vector machines (SVM). In this paper, we report a new probabilistic method for protein secondary structure prediction, based on dynamic Bayesian networks (DBN). The new method models the PSI-BLAST profile of a protein sequence using a multivariate Gaussian distribution, and simultaneously takes into account the dependency between the profile and secondary structure and the dependency between profiles of neighboring residues. In addition, a segment length distribution is introduced for each secondary structure state. Tests show that the DBN method significantly improves accuracy compared to other pure HMM-type methods. Further improvement is achieved by combining the DBN with an NN, a method called DBNN, which shows better Q3 accuracy than many popular methods and is competitive with the current state of the art. The most interesting feature of DBN/DBNN is that a significant improvement in prediction accuracy is achieved when it is combined with other methods by a simple consensus. The DBN method, using a Gaussian distribution for the PSI-BLAST profile and a high-order dependency between profiles of neighboring residues, produces significantly better prediction accuracy than other HMM-type probabilistic methods. Owing to their different nature, the DBN and NN combine to form a more accurate method, DBNN. Future improvement may be achieved by combining DBNN with an SVM-type method. |
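
The abstract refers to Q3 accuracy and to combining methods "by a simple consensus" without spelling out either computation. The sketch below is one possible reading, not the authors' implementation: Q3 is taken as the fraction of residues whose three-state label (H, E, C) matches the observed one, and the consensus is assumed to be a per-residue majority vote with ties going to the method listed first. The function names `q3_accuracy` and `majority_consensus` are hypothetical.

```python
from collections import Counter


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def majority_consensus(predictions: list[str]) -> str:
    """Per-residue majority vote over several predictions (assumed scheme).

    Ties are resolved in favor of the earliest method in `predictions`.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        top = max(counts.values())
        # Pick the first state, in method order, that reaches the top count.
        consensus.append(next(s for s in column if counts[s] == top))
    return "".join(consensus)


if __name__ == "__main__":
    # Toy strings for illustration only; not data from the paper.
    methods = ["HHEC", "HHCC", "HECC"]
    combined = majority_consensus(methods)
    print(combined, q3_accuracy(combined, "HHCC"))
```

A weighted vote or a vote over per-state probabilities would be an equally plausible reading; the abstract does not say which scheme was used.
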
Year | DOI | Venue |
---|---|---|
2008 | 10.1186/1471-2105-9-49 | BMC Bioinformatics |
Keywords | Field | DocType |
bayes theorem,neural network,probabilistic model,computer simulation,secondary structure,support vector machine,hidden markov model,probabilistic method,gaussian distribution,proteins,machine learning,bioinformatics,protein sequence,dynamic bayesian network,microarrays,amino acid sequence,algorithms | Conditional probability distribution,Computer science,Artificial intelligence,Probabilistic logic,Artificial neural network,Bayes' theorem,Pattern recognition,Support vector machine,Bioinformatics,Hidden Markov model,Protein secondary structure,Machine learning,Dynamic Bayesian network | Journal |
Volume | Issue | ISSN |
9 | 1 | 1471-2105 |
Citations | PageRank | References |
42 | 0.91 | 8 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xin-Qiu Yao | 1 | 52 | 2.69 |
Huaiqiu Zhu | 2 | 162 | 15.27 |
Zhen-Su She | 3 | 125 | 9.43 |