Title
Multivariate Feature Selection using Random Subspace Classifiers for Gene Expression Data
Abstract
Gene expression analysis techniques identify important genes that predict specified outcomes based on sample characteristics. Given the small sample sizes common to these studies and the large dimensionality of the data, feature selection methods are essential. In addition, cancer-related expression analysis often involves imbalanced datasets due to rare forms of disease. Popular methods of feature selection employ univariate techniques to identify the features most suitable for analysis. We propose a multivariate technique for selecting accurate subsets of features using an approach based on random subspaces. The random subspace method is used to explore random combinations of features, and only subspaces that produce accurate classifiers are retained. The method is tested on two independent gene expression datasets and compared with a univariate approach. The multivariate feature selection method resulted in a 33% improvement in overall classification accuracy and a 90% improvement in classification accuracy for the minority class.
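The selection loop described in the abstract (draw random feature combinations, keep only the subspaces whose classifier is accurate) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not name the base classifier, the subspace size, the accuracy threshold, or how retained subspaces are combined, so the nearest-centroid classifier, leave-one-out accuracy, the parameters `n_subspaces`, `subspace_size`, and `min_accuracy`, the vote counting, and the synthetic data are all assumptions made for illustration.

```python
import random
from collections import Counter


def nearest_centroid_predict(train_X, train_y, x, features):
    """Predict the label of x from class centroids restricted to `features`.

    Assumption: a nearest-centroid classifier stands in for the
    (unspecified) base classifier.
    """
    sums, counts = {}, Counter(train_y)
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += row[f]
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        # Squared Euclidean distance from x to this class centroid.
        dist = sum((x[f] - acc[i] / counts[label]) ** 2
                   for i, f in enumerate(features))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy on one feature subspace (an assumed criterion)."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i], features) == y[i]:
            correct += 1
    return correct / len(X)


def select_features(X, y, n_subspaces=200, subspace_size=3,
                    min_accuracy=0.9, seed=0):
    """Count how often each feature occurs in a retained (accurate) subspace."""
    rng = random.Random(seed)
    votes, kept = Counter(), 0
    for _ in range(n_subspaces):
        features = rng.sample(range(len(X[0])), subspace_size)
        if loo_accuracy(X, y, features) >= min_accuracy:
            votes.update(features)  # retain this subspace
            kept += 1
    return votes, kept


# Synthetic, imbalanced toy data (hypothetical): 15 majority and 5 minority
# samples, 20 features, only feature 0 carries class signal.
data_rng = random.Random(42)
labels = [0] * 15 + [1] * 5
X = []
for label in labels:
    row = [data_rng.gauss(0.0, 1.0) for _ in range(20)]
    row[0] += 6.0 * label
    X.append(row)

votes, kept = select_features(X, labels)
print("retained subspaces:", kept)
print("most frequent features:", votes.most_common(3))
```

Because each candidate is scored as a whole subspace, a feature is credited only in the context of the features drawn alongside it, which is what distinguishes this from a univariate ranking. Ranking features by their vote count is one possible way to turn retained subspaces into a final feature set; the abstract does not say how this step is done.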
Year
2007
DOI
10.1109/BIBE.2007.4375685
Venue
Boston, MA
Keywords
cancer, cellular biophysics, genetics, medical computing, molecular biophysics, gene expression, multivariate feature selection, random subspace classifiers, classifiers, feature selection, microarray, random subspaces
Field
Data mining, Feature selection, Computer science, Random subspace method, Artificial intelligence, Subspace topology, Pattern recognition, Multivariate statistics, Curse of dimensionality, Linear subspace, Bioinformatics, Univariate, Machine learning, Sample size determination
DocType
Conference
ISBN
978-1-4244-1509-0
Citations
2
PageRank
0.38
References
0
Authors
4
Name | Order | Citations | PageRank
Vidya P. Kamath | 1 | 2 | 0.38
Lawrence O. Hall | 2 | 5543 | 335.87
Yeatman, Timothy J. | 3 | 2 | 0.38
Steven Eschrich | 4 | 89 | 10.81