Title
A structure-based protocol for learning the family-specific mechanisms of membrane-binding domains.
Abstract
Peripheral membrane-targeting domain (MTD) families, such as C1-, C2- and PH domains, play a key role in signal transduction and membrane trafficking by dynamically translocating their parent proteins to specific plasma membranes when changes in lipid composition occur. It is, however, difficult to determine which domains within a family display this property, as the sequence motifs that signify membrane binding are not well defined. For this reason, procedures based on sequence similarity alone are often insufficient for the computational identification of MTDs within families (yielding less than 65% accuracy even with a sequence identity of 70%). We present a machine learning protocol for determining membrane-targeting properties that achieves 85-90% accuracy in separating binding and non-binding domains within families. Our model is based on features from both sequence and structure, thereby incorporating statistics obtained from the entire domain family as well as domain-specific physical quantities such as surface electrostatics. In addition, by using the enriched rules in alternating decision tree (ADtree) classifiers, we are able to determine the meaning of the assigned function labels in terms of biological mechanisms. The high accuracy of the learned models and the good agreement between the rules discovered using the ADtree classifier and mechanisms reported in the literature reflect the value of machine learning protocols in both prediction and biological knowledge discovery. Our protocol can thus potentially be used as a general function annotation and knowledge mining tool for other protein domains. Availability: metador.bioengr.uic.edu. Contact: huilu@uic.edu.
Year: 2012
DOI: 10.1093/bioinformatics/bts409
Venue: Bioinformatics
Keywords: family-specific mechanism, sequence motif, membrane-binding domain, high accuracy, biological mechanism, sequence similarity, C2- and PH domain, sequence identity, assigned function label, structure-based protocol, entire domain family, biological knowledge discovery, general function annotation, membrane proteins, artificial intelligence, static electricity
Field: Membrane protein, Protein domain, Computer science, Sequence motif, Mechanism (biology), Protein Sorting Signals, Knowledge extraction, Bioinformatics, Classifier (linguistics), Alternating decision tree
DocType: Journal
Volume: 28
Issue: 18
ISSN: 1367-4811
Citations: 0
PageRank: 0.34
References: 3
Authors: 4
Name             Order  Citations  PageRank
Morten Källberg  1      29         2.90
Nitin Bhardwaj   2      134        10.05
Robert Langlois  3      0          0.34
Hui Lu           4      49         6.27