Title
DDIG-in: detecting disease-causing genetic variations due to frameshifting indels and nonsense mutations employing sequence and structural properties at nucleotide and protein levels.
Abstract
Motivation: Frameshifting (FS) indels and nonsense (NS) variants disrupt the protein-coding sequence downstream of the mutation site by changing the reading frame or introducing a premature termination codon, respectively. Despite such drastic changes to the protein sequence, FS indels and NS variants have been discovered in healthy individuals. How to discriminate disease-causing from neutral FS indels and NS variants is an understudied problem.
Results: We have built a machine learning method called DDIG-in (FS) based on real human genetic variations from the Human Gene Mutation Database (inherited disease-causing) and the 1000 Genomes Project (GP) (putatively neutral). The method incorporates both sequence and predicted structural features and yields a robust performance by 10-fold cross-validation and independent tests on both FS indels and NS variants. We showed that human-derived NS variants and FS indels derived from animal orthologs can be effectively employed for independent testing of our method trained on human-derived FS indels. DDIG-in (FS) achieves a Matthews correlation coefficient (MCC) of 0.59, a sensitivity of 86%, and a specificity of 72% for FS indels. Application of DDIG-in (FS) to NS variants yields essentially the same performance (MCC of 0.43) as a method that was specifically trained for NS variants. DDIG-in (FS) was shown to make a significant improvement over existing techniques.
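The abstract reports sensitivity, specificity, and the Matthews correlation coefficient (MCC). As a minimal sketch of how these three metrics follow from confusion-matrix counts, the Python below uses the standard definitions. The function name `binary_metrics` and the counts are hypothetical, chosen only so that sensitivity and specificity equal the reported 86% and 72% on an assumed balanced set of 100 disease-causing and 100 neutral variants; they are not the paper's data, and nothing here implies the actual test set was balanced.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    tp/fn: disease-causing variants predicted correctly/incorrectly.
    tn/fp: neutral variants predicted correctly/incorrectly.
    """
    # Sensitivity: fraction of disease-causing variants that are detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of neutral variants that are correctly passed.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts (illustration only, not taken from the paper).
sens, spec, mcc = binary_metrics(tp=86, fn=14, tn=72, fp=28)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```

Unlike sensitivity and specificity, MCC depends on the class balance, so the same sensitivity and specificity would give a different MCC on a test set with different proportions of disease-causing and neutral variants.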
Year
2015
DOI
10.1093/bioinformatics/btu862
Venue
BIOINFORMATICS
Field
Gene, Matthews correlation coefficient, Protein sequencing, Biology, Genetic variation, 1000 Genomes Project, Bioinformatics, Nonsense mutation, Genetics, Mutation, Indel
DocType
Journal
Volume
31
Issue
10
ISSN
1367-4803
Citations
4
PageRank
0.50
References
11
Authors
9
Name             Order   Citations   PageRank
Lukas Folkman    1       20          2.83
Yuedong Yang     2       4           0.50
Zhixiu Li        3       4           0.50
Bela Stantic     4       198         38.54
Abdul Sattar     5       1389        185.70
Matthew Mort     6       4           0.50
David N Cooper   7       28          2.92
Yunlong Liu      8       267         21.81
Ying Zhou        9       8           2.14