Title
Sparse coding for feature selection on genome-wide association data
Abstract
Genome-wide association (GWA) studies provide large amounts of high-dimensional data. GWA studies aim to identify variables that increase the risk for a given phenotype. Univariate examinations have provided some insights, but it appears that most diseases are affected by interactions of multiple factors, which can only be identified through a multivariate analysis. However, multivariate analysis on the discrete, high-dimensional and low-sample-size GWA data is made more difficult by the presence of random effects and nonspecific coupling. In this work, we investigate the suitability of three standard techniques (p-values, SVM, PCA) for analyzing GWA data on several simulated datasets. We compare these standard techniques against a sparse coding approach; we demonstrate that sparse coding clearly outperforms the other approaches and can identify interacting factors in far higher-dimensional datasets than the other three approaches.
Year
2010
DOI
10.1007/978-3-642-15819-3_44
Venue
ICANN (1)
Keywords
feature selection, high-dimensional data, simulated datasets, higher-dimensional datasets, gwa study, standard technique, multivariate analysis, genome-wide association data, genome-wide association, low-sample-size gwa data, sparse coding approach, gwa data, random effects, sample size, sparse coding, genome wide association, high dimensional data, machine learning, snp
DocType
Conference
Volume
6352
ISSN
0302-9743
ISBN
3-642-15818-8
Citations
1
PageRank
0.35
References
3
Authors
3
Name, Order, Citations, PageRank
Ingrid Brænne, 1, 1, 0.69
Kai Labusch, 2, 113, 8.50
Amir Madany Mamlouk, 3, 37, 9.52