Abstract | ||
---|---|---|
Gene expression data usually contain a large number of genes but only a small number of samples. Feature selection for gene expression data aims to find the set of genes that best discriminates between biological samples of different types. In this paper, we propose a two-stage selection algorithm for genomic data that combines MRMR (Minimum Redundancy–Maximum Relevance) and GA (Genetic Algorithm). In the first stage, MRMR is used to filter out noisy and redundant genes in high-dimensional microarray data. In the second stage, the GA uses classifier accuracy as its fitness function to select the most discriminating genes. The proposed method is tested for tumor classification on five open datasets (NCI, Lymphoma, Lung, Leukemia and Colon) using Support Vector Machine (SVM) and Naïve Bayes (NB) classifiers. A comparison of MRMR-GA with the MRMR filter and the GA wrapper shows that our method finds the smallest gene subset that achieves the highest classification accuracy in leave-one-out cross-validation (LOOCV). |
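
The two stages described in the abstract can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. It assumes expression values have already been discretized, uses the relevance-minus-mean-redundancy form of MRMR, and substitutes a small hand-written categorical naive Bayes for the SVM and NB classifiers named in the abstract. The function names, the GA settings (truncation selection, one-point crossover, bit-flip mutation) and the fewer-genes tie-break are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_info(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * math.log(c * n / (pa[u] * pb[v]))
               for (u, v), c in pab.items())


def mrmr(X, y, k):
    """Stage 1: greedily keep k genes with high relevance and low redundancy."""
    cols = list(zip(*X))  # one tuple of values per gene
    k = min(k, len(cols))
    relevance = [mutual_info(c, y) for c in cols]
    selected = [max(range(len(cols)), key=relevance.__getitem__)]
    while len(selected) < k:
        def score(j):
            redundancy = sum(mutual_info(cols[j], cols[s]) for s in selected)
            return relevance[j] - redundancy / len(selected)
        rest = [j for j in range(len(cols)) if j not in selected]
        selected.append(max(rest, key=score))
    return selected


def nb_predict(train_X, train_y, x, genes):
    """Categorical naive Bayes with Laplace smoothing (stand-in classifier)."""
    best, best_lp = None, -math.inf
    for label, count in Counter(train_y).items():
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        lp = math.log(count / len(train_y))
        for g in genes:
            levels = len({r[g] for r in train_X} | {x[g]})
            hits = sum(r[g] == x[g] for r in rows)
            lp += math.log((hits + 1) / (count + levels))
        if lp > best_lp:
            best, best_lp = label, lp
    return best


def loocv_accuracy(X, y, genes):
    """Fraction of samples classified correctly when each is held out once."""
    correct = 0
    for i in range(len(X)):
        pred = nb_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], genes)
        correct += (pred == y[i])
    return correct / len(X)


def ga_select(X, y, candidates, pop_size=20, generations=15, p_mut=0.1, seed=0):
    """Stage 2: GA over bit masks of the candidates; fitness is LOOCV accuracy."""
    rng = random.Random(seed)

    def genes(mask):
        return [g for g, bit in zip(candidates, mask) if bit]

    def fitness(mask):
        if not any(mask):
            return (0.0, 0)
        # Accuracy first; fewer genes breaks ties (an assumed tie-break).
        return (loocv_accuracy(X, y, genes(mask)), -sum(mask))

    pop = [[rng.randint(0, 1) for _ in candidates] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]  # truncation selection keeps the elite
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates))
            child = a[:cut] + b[cut:]  # one-point crossover
            children.append([bit ^ (rng.random() < p_mut) for bit in child])
        pop = parents + children
    best = max(pop, key=fitness)
    return genes(best), fitness(best)[0]


if __name__ == "__main__":
    # Toy data: gene 0 copies the label, gene 1 duplicates gene 0,
    # genes 2 and 3 are independent of the label.
    y = [i % 2 for i in range(8)]
    X = [[y[i], y[i], i // 2 % 2, i // 4] for i in range(8)]
    candidates = mrmr(X, y, k=3)
    subset, accuracy = ga_select(X, y, candidates)
    print("MRMR candidates:", candidates)
    print("GA subset:", subset, "LOOCV accuracy:", accuracy)
```

Running the filter first keeps the GA search space small: the wrapper only evolves bit masks over the `k` MRMR candidates rather than over every gene, which matters because each fitness evaluation costs a full LOOCV pass.
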
Year | DOI | Venue |
---|---|---|
2011 | 10.1007/s10115-010-0288-x | Knowl. Inf. Syst. |
Keywords | DocType | Volume |
feature selection, genomic data, redundant gene, gene expression data, genetic algorithm, mrmr, support vector machine, naïve bayes classifier, loocv, classification accuracy, smallest gene subset, ga wrapper, mrmr filter, high-dimensional microarray data, classifier accuracy, two-stage gene selection scheme, fitness function, microarray data, gene selection, leave one out cross validation, bayes classifier | Journal | 26 |
Issue | ISSN | Citations |
3 | 0219-3116 | 56 |
PageRank | References | Authors |
1.44 | 31 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ali El Akadi | 1 | 56 | 1.44 |
Aouatif Amine | 2 | 85 | 9.29 |
Abdeljalil El Ouardighi | 3 | 56 | 1.44 |
Driss Aboutajdine | 4 | 589 | 88.82 |