Title
Iterative Feature Perturbation As A Gene Selector For Microarray Data
Abstract
Gene-expression microarray datasets typically consist of a small number of samples with a large number of gene-expression measurements, usually on the order of thousands. Dimensionality reduction is therefore critical prior to any classification task. In this work, the iterative feature perturbation method (IFP), an embedded gene selector, is introduced and applied to four microarray cancer datasets: colon cancer, leukemia, Moffitt colon cancer, and lung cancer. We compare the results obtained by IFP to those of support vector machine-recursive feature elimination (SVM-RFE) and of the t-test used as a feature filter, with a linear support vector machine as the base classifier. We also analyze the intersection of the gene sets selected by the three methods across the four datasets. Additional experiments included an initial pre-selection of the top 200 genes based on their p-values, after which IFP and SVM-RFE were applied to the reduced feature sets. These results showed up to 3.32% average performance improvement for IFP across the four datasets. A statistical analysis (using the Friedman/Holm test) of both scenarios showed that the highest accuracies came from the t-test used as a filter in the experiments without gene pre-selection. IFP and SVM-RFE had greater classification accuracy after gene pre-selection. The analysis showed that the t-test is a good gene selector for microarray data, and that IFP and SVM-RFE improve in performance on a dataset already reduced by the t-test. IFP resulted in comparable or superior average class accuracy when compared to SVM-RFE on three of the four datasets. The same or similar accuracies can be obtained with different sets of genes.
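The pre-selection step mentioned in the abstract (keeping the top 200 genes by t-test p-value) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a two-class problem and a pooled-variance two-sample t-test, which the abstract does not specify. Under that assumption every gene shares the same degrees of freedom, so ranking by the largest |t| gives the same order as ranking by the smallest two-sided p-value. The names `t_statistic` and `preselect_genes` and the toy data are hypothetical.

```python
import math
from statistics import mean, variance


def t_statistic(a, b):
    """Pooled-variance two-sample t statistic (needs >= 2 values per class)."""
    na, nb = len(a), len(b)
    diff = mean(a) - mean(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        # No within-class spread: perfectly separating or completely flat gene.
        return math.copysign(math.inf, diff) if diff != 0 else 0.0
    return diff / math.sqrt(pooled * (1 / na + 1 / nb))


def preselect_genes(samples, labels, k=200):
    """Return the indices of the k genes with the largest |t| between classes.

    samples: list of per-sample expression vectors (one value per gene).
    labels:  list of 0/1 class labels, one per sample (an assumed encoding).
    """
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    scores = []
    for g in range(len(samples[0])):
        t = t_statistic([s[g] for s in pos], [s[g] for s in neg])
        scores.append((abs(t), g))
    # Largest |t| first; ties are broken by gene index for determinism.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [g for _, g in scores[:k]]


# Toy data: 6 samples x 3 genes; only gene 0 differs between the classes.
samples = [
    [5.0, 1.0, 2.0],
    [5.2, 0.9, 2.1],
    [4.9, 1.1, 1.9],
    [1.0, 1.0, 2.2],
    [1.1, 1.2, 1.8],
    [0.9, 0.8, 2.0],
]
labels = [1, 1, 1, 0, 0, 0]
top = preselect_genes(samples, labels, k=1)
```

In the setting the abstract describes, a wrapper or embedded selector such as SVM-RFE or IFP would then be run only on the columns returned by this filter.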
Year
2012
DOI
10.1142/S0218001412600038
Venue
INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE
Keywords
Feature selection, microarray analysis, gene selection, t-test, feature perturbation
Field
Data mining, Dimensionality reduction, Gene, Feature selection, Microarray analysis techniques, Artificial intelligence, Classifier (linguistics), Pattern recognition, Support vector machine, Machine learning, Mathematics, Performance improvement, Statistical analysis
DocType
Journal
Volume
26
Issue
5
ISSN
0218-0014
Citations
4
PageRank
0.40
References
15
Authors
5
Name | Order | Citations | PageRank
Juana Canul-Reich | 1 | 9 | 3.60
Lawrence O. Hall | 2 | 5543 | 335.87
Dmitry B. Goldgof | 3 | 2021 | 198.90
John N. Korecki | 4 | 7 | 1.13
Steven Eschrich | 5 | 89 | 10.81