Title
More Accurate Models For Detecting Gene-Gene Interactions From Public Expression Compendia
Abstract
The fast accumulation of public gene expression data, made possible by high-throughput technology, provides an unprecedented opportunity to identify functionally related genes by analyzing their co-expression patterns. However, these data are typically noisy and highly heterogeneous, which complicates their use in constructing co-expression networks from large expression compendia. Previous studies suggested that collective gene expression patterns can be better modeled by Gaussian mixtures. This motivates our present work, which proposes Multimodal Mutual Information (MMI) to reconstruct gene co-expression networks from public gene expression data. MMI assumes that each gene pair follows a bivariate Gaussian mixture model and categorizes the samples into unique bins with respect to their expression magnitude. Two kinds of correlation are computed and aggregated in MMI to capture both the discretized dependency and the expression correlation within each bin. In extensive simulations, MMI outperforms other approaches in calculating gene-gene interactions, regardless of the level of noise or the strength of the interactions. The advantage of MMI is further validated on three real problems: 1. Inferring novel gene functions from their connections with well-documented genes. We apply principal component analysis to the correlation matrices generated by MMI and by Pearson correlation and construct transcriptional components to evaluate gene-gene interactions. MMI yields 1.7 times more eligible transcriptional components than Pearson correlation, which can be used to predict gene functions. 2. Prioritizing candidate genes for an affected pedigree. MMI calculates the interactions between candidate genes and established disease genes and identifies KIF1A as a new causal gene of pure hereditary spastic paraparesis. 3. Detecting disease "hot genes". MMI identifies ANK2 as the "hot gene" for autism spectrum disorders by evaluating its co-expression with other disease susceptibility genes derived from trio-based exome sequencing data.
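The principal component step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `transcriptional_components`, the synthetic expression matrix, and the choice of two components are assumptions made only for illustration. The sketch applies an eigendecomposition to a symmetric gene-gene matrix, here a Pearson correlation matrix; an MMI matrix, if available, could be passed in the same way.

```python
import numpy as np


def transcriptional_components(similarity, n_components):
    """Top eigenvalues/eigenvectors of a symmetric gene-gene matrix.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Symmetrize to guard against small floating-point asymmetries.
    sym = (similarity + similarity.T) / 2.0
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(sym)
    # Keep the n_components largest eigenvalues, largest first.
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvals[order], eigvecs[:, order]


# Synthetic toy data (assumption): 6 genes x 50 samples.
rng = np.random.default_rng(0)
shared = rng.normal(size=50)
expr = rng.normal(size=(6, 50))
expr[:3] += 2.0 * shared  # the first three genes share a common signal

# Pearson correlation between genes (rows are treated as variables).
pearson = np.corrcoef(expr)

eigvals, components = transcriptional_components(pearson, n_components=2)
# Fraction of total variance captured by each retained component.
explained = eigvals / np.trace(pearson)
```

Each column of `components` is one candidate component, and genes with large absolute loadings on the same column are the ones the matrix treats as co-expressed.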
Year: 2016
Venue: 2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM)
Keywords: MMI, Public gene expression data, Prioritize candidate gene
Field: Pearson product-moment correlation coefficient, Gene, Candidate gene, Computer science, Correlation, Mutual information, Bioinformatics, Mixture model, Exome sequencing, Principal component analysis
DocType: Conference
ISSN: 2156-1125
Citations: 1
PageRank: 0.36
References: 9
Authors: 3
Name            Order  Citations  PageRank
Lu Zhang        1      4          1.57
Jia Xing Chen   2      1          0.36
Shuai Cheng Li  3      184        30.25