Title
A dynamic Bayesian network approach to protein secondary structure prediction.
Abstract
Protein secondary structure prediction methods based on probabilistic models such as the hidden Markov model (HMM) appeal to many because they provide meaningful information about the sequence-structure relationship. At present, however, the prediction accuracy of pure HMM-type methods is much lower than that of machine learning-based methods such as neural networks (NN) or support vector machines (SVM).
In this paper, we report a new probabilistic method for protein secondary structure prediction based on dynamic Bayesian networks (DBN). The new method models the PSI-BLAST profile of a protein sequence using a multivariate Gaussian distribution, and simultaneously takes into account the dependency between the profile and secondary structure and the dependency between profiles of neighboring residues. In addition, a segment length distribution is introduced for each secondary structure state. Tests show that the DBN method significantly improves accuracy over other pure HMM-type methods. Further improvement is achieved by combining the DBN with an NN, a method called DBNN, which shows better Q3 accuracy than many popular methods and is competitive with the current state of the art. The most interesting feature of DBN/DBNN is that a significant improvement in prediction accuracy is achieved when it is combined with other methods by a simple consensus.
The DBN method, which uses a Gaussian distribution for the PSI-BLAST profile and a high-order dependency between profiles of neighboring residues, produces significantly better prediction accuracy than other HMM-type probabilistic methods. Owing to their different nature, the DBN and NN combine to form a more accurate method, DBNN. Future improvement may be achieved by combining DBNN with an SVM-type method.
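The abstract reports Q3 accuracy and a "simple consensus" of several predictors without spelling out either. As a rough illustration only, the sketch below computes Q3 as the fraction of residues whose three-state label (H, E, C) matches the observed one, and forms a consensus by per-residue majority vote. The majority-vote rule, the tie-breaking choice, and the function names `q3` and `consensus` are assumptions made for this example, not the paper's actual procedure.

```python
from collections import Counter


def q3(predicted: str, observed: str) -> float:
    """Fraction of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


def consensus(predictions: list[str]) -> str:
    """Per-residue majority vote over several predictions.

    Assumption: ties are broken in favour of the earliest-listed predictor.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have equal length")
    result = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        # Pick the first label, in predictor order, that reaches the top count.
        result.append(next(label for label in column if counts[label] == best))
    return "".join(result)


# Toy data, invented for illustration (H = helix, E = strand, C = coil).
observed = "HHHECC"
method_a = "HHHCCC"
method_b = "HHEECC"
method_c = "CHHECE"
combined = consensus([method_a, method_b, method_c])
```

In this toy case each individual prediction has at least one error, while the voted string `combined` agrees with `observed` at every position; real predictors will not generally combine this cleanly.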
Year: 2008
DOI: 10.1186/1471-2105-9-49
Venue: BMC Bioinformatics
Keywords: bayes theorem, neural network, probabilistic model, computer simulation, secondary structure, support vector machine, hidden markov model, probabilistic method, gaussian distribution, proteins, machine learning, bioinformatics, protein sequence, dynamic bayesian network, microarrays, amino acid sequence, algorithms
Field: Conditional probability distribution, Computer science, Artificial intelligence, Probabilistic logic, Artificial neural network, Bayes' theorem, Pattern recognition, Support vector machine, Bioinformatics, Hidden Markov model, Protein secondary structure, Machine learning, Dynamic Bayesian network
DocType: Journal
Volume: 9
Issue: 1
ISSN: 1471-2105
Citations: 42
PageRank: 0.91
References: 8
Authors: 3
Name | Order | Citations | PageRank
Xin-Qiu Yao | 1 | 52 | 2.69
Huaiqiu Zhu | 2 | 162 | 15.27
Zhen-Su She | 3 | 125 | 9.43