Title | ||
---|---|---|
DDIG-in: detecting disease-causing genetic variations due to frameshifting indels and nonsense mutations employing sequence and structural properties at nucleotide and protein levels. |
Abstract | ||
---|---|---|
Motivation: Frameshifting (FS) indels and nonsense (NS) variants disrupt the protein-coding sequence downstream of the mutation site by changing the reading frame or introducing a premature termination codon, respectively. Despite such drastic changes to the protein sequence, FS indels and NS variants have been discovered in healthy individuals. How to discriminate disease-causing from neutral FS indels and NS variants is an understudied problem. Results: We have built a machine learning method called DDIG-in (FS) based on real human genetic variations from the Human Gene Mutation Database (inherited disease-causing) and the 1000 Genomes Project (GP) (putatively neutral). The method incorporates both sequence and predicted structural features and yields a robust performance by 10-fold cross-validation and independent tests on both FS indels and NS variants. We showed that human-derived NS variants and FS indels derived from animal orthologs can be effectively employed for independent testing of our method trained on human-derived FS indels. DDIG-in (FS) achieves a Matthews correlation coefficient (MCC) of 0.59, a sensitivity of 86%, and a specificity of 72% for FS indels. Application of DDIG-in (FS) to NS variants yields essentially the same performance (MCC of 0.43) as a method that was specifically trained for NS variants. DDIG-in (FS) was shown to make a significant improvement over existing techniques. |
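
The abstract reports sensitivity, specificity, and the Matthews correlation coefficient (MCC). As a minimal sketch of how these three metrics relate to a binary confusion matrix, the snippet below computes them from raw counts. The function name `binary_metrics` and the counts are hypothetical, chosen only for illustration on an assumed balanced test set. They are not the study's data or code.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    tp/fn: disease-causing variants predicted correctly/incorrectly.
    tn/fp: neutral variants predicted correctly/incorrectly.
    """
    # Sensitivity: fraction of disease-causing variants that are detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of neutral variants that are correctly passed.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts: 100 disease-causing and 100 neutral variants.
sens, spec, mcc = binary_metrics(tp=86, fp=28, tn=72, fn=14)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.2f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the disease-causing and neutral classes differ in size.
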
Year | DOI | Venue |
---|---|---|
2015 | 10.1093/bioinformatics/btu862 | BIOINFORMATICS |
Field | DocType | Volume |
---|---|---|
Gene, Matthews correlation coefficient, Protein sequencing, Biology, Genetic variation, 1000 Genomes Project, Bioinformatics, Nonsense mutation, Genetics, Mutation, Indel | Journal | 31 |
Issue | ISSN | Citations |
---|---|---|
10 | 1367-4803 | 4 |
PageRank | References | Authors |
---|---|---|
0.50 | 11 | 9 |
Name | Order | Citations | PageRank |
---|---|---|---|
Lukas Folkman | 1 | 20 | 2.83 |
Yuedong Yang | 2 | 4 | 0.50 |
Zhixiu Li | 3 | 4 | 0.50 |
Bela Stantic | 4 | 198 | 38.54 |
Abdul Sattar | 5 | 1389 | 185.70 |
Matthew Mort | 6 | 4 | 0.50 |
David N Cooper | 7 | 28 | 2.92 |
Yunlong Liu | 8 | 267 | 21.81 |
Ying Zhou | 9 | 8 | 2.14 |