Title
Multilevel biological characterization of exomic variants at the protein level significantly improves the identification of their deleterious effects.
Abstract
Motivation: There are now many predictors capable of identifying the likely phenotypic effects of single nucleotide variants (SNVs) or short in-frame insertions or deletions (INDELs) in the increasing amount of genome sequence data. Most of these predictors focus on SNVs and use a combination of features related to sequence conservation and to biophysical and/or structural properties to link the observed variant to either a neutral or a disease phenotype. Despite notable successes, the mapping between genetic variants and their phenotypic effects is riddled with levels of complexity that are not yet fully understood and that are often not taken into account in the predictions, despite their promise of significantly improving the prediction of deleterious mutants.

Results: We present DEOGEN, a novel variant effect predictor that can handle both missense SNVs and in-frame INDELs. By integrating information from different biological scales and mimicking the complex mixture of effects that lead from the variant to the phenotype, we obtain significant improvements in the variant-effect prediction results. Next to the typical variant-oriented features based on the evolutionary conservation of the mutated positions, we added a collection of protein-oriented features that are based on functional aspects of the affected gene. We cross-validated DEOGEN on 36 825 polymorphisms, 20 821 deleterious SNVs, and 1038 INDELs from SwissProt. The multilevel contextualization of each (variant, protein) pair in DEOGEN provides a 10% improvement in Matthews correlation coefficient (MCC) with respect to current state-of-the-art tools.
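The abstract reports its headline result as an MCC improvement. As a rough illustration of how that metric is computed for a binary deleterious/neutral classifier, here is a minimal Python sketch. It is not taken from DEOGEN; the function names and the toy labels are hypothetical and chosen only for this example.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = deleterious, 0 = neutral)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels, invented for illustration only (not data from the paper).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
toy_mcc = matthews_corrcoef(*confusion_counts(y_true, y_pred))
print(f"MCC on toy labels: {toy_mcc:.3f}")
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (perfect agreement). Because it uses all four confusion-matrix cells, it is commonly preferred over plain accuracy when the two classes are unevenly represented, as they are in the polymorphism and deleterious-variant counts quoted above.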
Year
2016
DOI
10.1093/bioinformatics/btw094
Venue
BIOINFORMATICS
Field
Missense mutation, Conserved sequence, Gene, Phenotype, Biology, Genetic variation, Whole genome sequencing, INDEL Mutation, Bioinformatics, Genetics, Indel
DocType
Journal
Volume
32
Issue
12
ISSN
1367-4803
Citations
2
PageRank
0.38
References
8
Authors
5
Name                Order   Citations   PageRank
Daniele Raimondi    1       8           2.95
Andrea M. Gazzo     2       2           0.71
Marianne Rooman     3       137         12.00
Tom Lenaerts        4       276         53.44
Wim F. Vranken      5       112         19.85