Title
Comparison of methods for meta-dimensional data analysis using in silico and biological data sets
Abstract
Recent technological innovations have catalyzed the generation of a massive amount of data at various levels of biological regulation, including DNA, RNA and protein. Due to the complex nature of biology, the underlying model may only be discovered by integrating different types of high-throughput data to perform a "meta-dimensional" analysis. For this study, we used simulated gene expression and genotype data to compare three methods that show potential for integrating different types of data in order to generate models that predict a given phenotype: the Analysis Tool for Heritable and Environmental Network Associations (ATHENA), Random Jungle (RJ), and Lasso. Based on our results, we applied RJ and ATHENA sequentially to a biological data set that consisted of genome-wide genotypes and gene expression levels from lymphoblastoid cell lines (LCLs) to predict cytotoxicity. The best model consisted of two SNPs and two gene expression variables with an r-squared value of 0.32.
Year
2012
DOI
10.1007/978-3-642-29066-4_12
Venue
EvoBIO
Keywords
best model, biological data set, athena sequentially, simulated gene expression, biological data, gene expression level, high-throughput data, different type, meta-dimensional data analysis, gene expression variable, genotype data, biological regulation, systems biology, evolutionary computation, neural networks, data integration, human genetics
Field
Data integration, Biological data, Biology, Lasso (statistics), Systems biology, Data type, Single-nucleotide polymorphism, Bioinformatics, Biological regulation, In silico
DocType
Conference
Citations
6
PageRank
0.55
References
8
Authors
6
Name                   Order  Citations  PageRank
Emily Rose Holzinger   1      33         4.50
Scott M. Dudek         2      206        26.27
Alex T. Frase          3      25         3.33
Brooke L. Fridley      4      26         4.66
Prabhakar Chalise      5      12         1.92
Marylyn D. Ritchie     6      692        86.79