Title
Predicting deleterious non-synonymous single nucleotide polymorphisms in signal peptides based on hybrid sequence attributes.
Abstract
Signal peptides play a crucial role in various biological processes, such as the localization of cell surface receptors, the translocation of secreted proteins, and cell-cell communication. However, amino acid mutations in signal peptides, which arise from non-synonymous single nucleotide polymorphisms (nsSNPs, also called SAPs), may lead to loss of function. In the present study, we propose a computational method for predicting deleterious nsSNPs in signal peptides based on random forest (RF), incorporating the position-specific scoring matrix (PSSM) profile, the SignalP score, and physicochemical properties. These features were optimized by the maximum relevance minimum redundancy (mRMR) method. A cost matrix was then used to reduce the effect of class imbalance, a problem that commonly occurs in nsSNP prediction. With an optimal subset of 10 features, the method achieved an overall accuracy of 84.5% and an area under the ROC curve (AUC) of 0.822 in the jackknife test. Furthermore, on the same dataset, we compared our predictor with existing methods, namely the R-score-based and D-score-based methods, and our method outperformed both. This performance suggests that our method is effective in predicting deleterious nsSNPs in signal peptides.
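The evaluation ideas named in the abstract (a cost matrix for imbalanced classes, the jackknife test, and AUC) can be illustrated with a minimal sketch. This is not the authors' implementation: the scorer below is a k-nearest-neighbour stand-in rather than a random forest, the one-dimensional feature values are toy data, and the cost values and every function name are assumptions made for illustration only.

```python
# Toy illustration only: hypothetical data, costs, and helper names.
# Label 1 = deleterious (minority class), label 0 = neutral.

# Cost matrices indexed as cost[true_label][predicted_label].
SYMMETRIC = [[0.0, 1.0], [1.0, 0.0]]   # all errors cost the same
COST = [[0.0, 1.0], [5.0, 0.0]]        # assumed: a missed deleterious case costs 5x


def min_cost_label(p_deleterious, cost=COST):
    """Return the label with the lowest expected misclassification cost."""
    p = [1.0 - p_deleterious, p_deleterious]
    expected = [sum(p[t] * cost[t][pred] for t in (0, 1)) for pred in (0, 1)]
    return 0 if expected[0] <= expected[1] else 1


def knn_score(train_X, train_y, x, k=3):
    """Stand-in scorer (NOT a random forest): share of positives among the k nearest neighbours."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return sum(train_y[i] for i in order[:k]) / k


def jackknife_scores(X, y, scorer=knn_score):
    """Jackknife (leave-one-out): score each sample using all the other samples."""
    return [scorer(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i]) for i in range(len(X))]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall(labels, preds):
    """Fraction of true positives that were predicted positive."""
    hits = sum(1 for lab, pr in zip(labels, preds) if lab == 1 and pr == 1)
    return hits / sum(labels)


# Imbalanced toy set: six neutral and three deleterious samples, one feature each.
X = [[0.10], [0.20], [0.30], [0.40], [0.50], [0.62], [0.55], [0.68], [0.90]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1]

scores = jackknife_scores(X, y)
plain = [min_cost_label(s, SYMMETRIC) for s in scores]
weighted = [min_cost_label(s, COST) for s in scores]

print("AUC:", auc(y, scores))
print("recall, symmetric costs:", recall(y, plain))
print("recall, weighted costs: ", recall(y, weighted))
```

On this toy data the weighted cost matrix recovers all three deleterious samples where symmetric costs recover only one, at the price of three additional false positives. That trade-off is the reason a cost matrix is useful when the deleterious class is rare.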
Year
2012
DOI
10.1016/j.compbiolchem.2011.12.001
Venue
Computational Biology and Chemistry
Keywords
single nucleotide polymorphism, jackknife test, deleterious non-synonymous, deleterious nsSNPs, existing method, cost matrix, hybrid sequence attribute, signal peptides, computational method, D-score-based method, R-score-based method, position specific scoring matrix, nsSNPs prediction, random forest, signal peptide
Field
Jackknife resampling, Cost matrix, Pattern recognition, Biology, Redundancy (engineering), Single-nucleotide polymorphism, Signal peptide, Artificial intelligence, Data classification, Bioinformatics, Random forest, Mutation
DocType
Journal
Volume
36
ISSN
1476-928X
Citations
0
PageRank
0.34
References
19
Authors
9
Name          Order  Citations  PageRank
Wenli Qin     1      0          1.01
Yizhou Li     2      69         4.70
Juan Li       3      0          0.34
Lezheng Yu    4      17         1.28
Di Wu         5      0          0.68
Runyu Jing    6      3          1.42
Xuemei Pu     7      4          3.97
Yanzhi Guo    8      11         3.83
Menglong Li   9      94         11.85