Title
Finding new structural and sequence attributes to predict possible disease association of single amino acid polymorphism (SAP).
Abstract
The rapid accumulation of single amino acid polymorphisms (SAPs), also known as non-synonymous single nucleotide polymorphisms (nsSNPs), creates both the opportunity and the need to understand and predict their disease association. Currently published attributes are limited and the detailed mechanisms governing the disease association of a SAP remain unclear, so further investigation of new attributes and improvement of the prediction are desirable. A SAP dataset was compiled from the Swiss-Prot variant pages. We extracted several new biologically informative attributes and demonstrated their effectiveness: structural neighbor profiles, which describe the SAP's microenvironment; nearby functional sites, which measure the structure-based and sequence-based distances between the SAP site and its nearby functional sites; aggregation properties, which measure the likelihood of protein aggregation; and disordered regions, which consider whether the SAP is located in a structurally disordered region. The new attributes provided insights into the mechanisms of the disease association of SAPs. We built a support vector machine (SVM) classifier employing a carefully selected set of new and previously published attributes. Under strict protein-level 5-fold cross-validation, we attained an overall accuracy of 82.61% and a Matthews correlation coefficient (MCC) of 0.60. Moreover, a web server was developed to provide a user-friendly interface for biologists. The web server is available at http://sapred.cbi.pku.edu.cn/
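The evaluation protocol named in the abstract (protein-level 5-fold cross-validation scored by MCC) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the round-robin assignment of proteins to folds, and the toy protein identifiers are assumptions made for illustration only. The key property shown is that every SAP from the same protein lands in the same fold, so no protein contributes to both training and test data.

```python
import random
from math import sqrt


def protein_level_folds(protein_ids, k=5, seed=0):
    """Return a fold index (0..k-1) for each sample.

    All samples that share a protein ID receive the same fold index.
    Assumption: proteins are shuffled and dealt round-robin, which balances
    the number of proteins per fold, not the number of SAPs per fold.
    """
    proteins = sorted(set(protein_ids))
    random.Random(seed).shuffle(proteins)
    fold_of_protein = {p: i % k for i, p in enumerate(proteins)}
    return [fold_of_protein[p] for p in protein_ids]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = disease, 0 = neutral).

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)),
    defined here as 0.0 when the denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical protein IDs: one entry per SAP, several SAPs per protein.
    sap_proteins = ["P1", "P1", "P2", "P3", "P3", "P3", "P4", "P5", "P6", "P6"]
    folds = protein_level_folds(sap_proteins, k=5)
    for test_fold in range(5):
        test_idx = [i for i, f in enumerate(folds) if f == test_fold]
        train_idx = [i for i, f in enumerate(folds) if f != test_fold]
        # A classifier would be trained on train_idx and scored on test_idx here.
        print(test_fold, len(train_idx), len(test_idx))
```

Splitting by protein rather than by individual SAP avoids rewarding a classifier for memorizing protein-specific features, which is presumably why the abstract describes the protocol as strict.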
Year
2007
DOI
10.1093/bioinformatics/btm119
Venue
Bioinformatics
Keywords
new attribute, web server, sap site, aggregation property, possible disease association, disease association, cn supplementary information, new biologically informative attribute, functional site, single amino acid polymorphism, sap dataset, disordered region, polymorphism, amino acid
Field
Data mining, Disease Association, Amino acid, Computer science, Support vector machine, Polymorphism (computer science), Single-nucleotide polymorphism, Bioinformatics, Classifier (linguistics), Web server
DocType
Journal
Volume
23
Issue
12
ISSN
1367-4811
Citations
7
PageRank
0.51
References
11
Authors
7
Name                 Order  Citations  PageRank
Zhi-Qiang Ye         1      71         4.60
Shuqi Zhao           2      84         5.36
Ge Gao               3      246        20.82
Xiao-Qiao Liu        4      56         3.48
Robert E. Langlois   5      24         1.62
Hui Lu               6      11         1.00
Liping Wei           7      260        22.03