Title
Using an Ensemble of Support Vector Machine Classifiers to Predict Protein Supersecondary Structural Motifs.
Abstract
The success of the Human Genome Project and the rapid growth in the number of protein sequences deposited in data banks have created a challenging problem: how to develop a fast and accurate method for predicting the supersecondary structural motifs of proteins. Such a method could help narrow the ever-widening gap between the number of known sequences and the number of known structures. To address this problem, this paper proposes a new method for predicting protein supersecondary structural motifs. The method combines basic amino acid composition with dipeptide components to represent protein sequence patterns. An ensemble classifier based on support vector machines is used to predict four kinds of supersecondary structural motifs in protein sequences. A total of twenty-four increments of diversity are defined for each supersecondary structural motif. The method is trained and tested on the ArchDB40 dataset, which contains 3088 proteins. The highest overall accuracies on the training dataset and the independent testing dataset are 74.8% and 69.3%, respectively. © 2011 ACADEMY PUBLISHER.
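The feature representation mentioned in the abstract (amino acid composition combined with dipeptide components) can be illustrated with a minimal sketch. The abstract does not give the exact encoding, so everything here is an assumption made for illustration: the function name `composition_features`, the 20-letter alphabet ordering, the frequency normalization, and the resulting 420-dimensional vector (20 residue frequencies followed by 400 dipeptide frequencies). The increments of diversity and the SVM ensemble itself are not reproduced.

```python
from itertools import product

# The 20 standard amino acids (ordering is an arbitrary choice for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def composition_features(sequence):
    """Return 20 amino acid frequencies followed by 400 dipeptide frequencies.

    Hypothetical helper: the normalization and the handling of non-standard
    residues (they are skipped) are assumptions, not taken from the paper.
    """
    seq = sequence.upper()

    # Single-residue composition over standard residues only.
    residues = [c for c in seq if c in AMINO_ACIDS]
    n = len(residues)
    aa_freq = [residues.count(a) / n if n else 0.0 for a in AMINO_ACIDS]

    # Dipeptide composition over adjacent pairs of standard residues.
    pairs = [a + b for a, b in zip(seq, seq[1:])
             if a in AMINO_ACIDS and b in AMINO_ACIDS]
    m = len(pairs)
    counts = {}
    for p in pairs:
        counts[p] = counts.get(p, 0) + 1
    dp_freq = [counts.get(d, 0) / m if m else 0.0 for d in DIPEPTIDES]

    return aa_freq + dp_freq


features = composition_features("ACAC")
```

A vector of this kind could then be passed to each member of a classifier ensemble; how the authors combine the individual support vector machines is not stated in the abstract.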
Year
2011
DOI
10.4304/jcp.6.10.2053-2059
Venue
JCP
Keywords
diversity measure, ensemble classifier, supersecondary structural motifs, support vector machines
Field
Data bank, Diversity measure, Pattern recognition, Computer science, Support vector machine, Structural motif, Artificial intelligence, Human genome, Classifier (linguistics), Machine learning
DocType
Journal
Volume
6
Issue
10
Citations
1
PageRank
0.36
References
7
Authors
3
Name | Order | Citations | PageRank
Dongsheng Zou | 1 | 19 | 2.65
Zhongshi He | 2 | 155 | 18.36
Yuan Yan | 3 | 3 | 1.44