Abstract | ||
---|---|---|
The data collected from a typical microarray experiment usually consist of tens of samples and thousands of genes (i.e., features). Typically, only a small subset of these features is relevant and non-redundant for differentiating the samples. Identifying an optimal subset of relevant genes is crucial for accurate classification of samples. In this paper, we propose a method for relevant gene subset selection for microarray gene expression data. Our method is based on the gap-tolerant classifier, a variation of the support vector machine, and uses a hill-climbing search strategy. Unlike most other hill-climbing approaches, which use classification accuracy alone as the criterion for feature selection, the proposed method uses a mixture of accuracy and SVM margin to select features. Our experimental results show that this strategy is effective both in selecting relevant features and in eliminating redundant ones. |
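
The search strategy described in the abstract can be illustrated with a minimal sketch. The abstract does not say how accuracy and margin are mixed, nor how the gap-tolerant classifier is trained, so everything below is an assumption: the convex weighting `alpha`, the forward (additive) search direction, and the `evaluate` callback, which stands in for training a classifier on a feature subset and returning its accuracy and margin. `toy_evaluate` is a hypothetical stand-in, not real microarray data.

```python
from typing import Callable, FrozenSet, Tuple

# Hypothetical callback: maps a feature subset to (accuracy, margin).
# In practice this would train and validate a classifier on that subset.
Evaluator = Callable[[FrozenSet[int]], Tuple[float, float]]


def mixed_score(accuracy: float, margin: float, alpha: float = 0.5) -> float:
    """Assumed criterion: convex combination of accuracy and margin."""
    return alpha * accuracy + (1.0 - alpha) * margin


def hill_climb_select(
    n_features: int, evaluate: Evaluator, alpha: float = 0.5
) -> Tuple[FrozenSet[int], float]:
    """Greedy forward hill climbing over feature subsets.

    Each step adds the single feature that most improves the mixed
    score and stops as soon as no addition improves it.
    """
    selected: FrozenSet[int] = frozenset()
    best = float("-inf")
    while True:
        candidates = [
            selected | {f} for f in range(n_features) if f not in selected
        ]
        if not candidates:
            break
        scored = [(mixed_score(*evaluate(c), alpha), c) for c in candidates]
        top_score, top_subset = max(scored, key=lambda pair: pair[0])
        if top_score <= best:
            break  # local optimum: no neighbour is better
        best, selected = top_score, top_subset
    return selected, best


def toy_evaluate(subset: FrozenSet[int]) -> Tuple[float, float]:
    """Toy stand-in: features 0 and 2 carry signal, 1 and 3 do not.

    The margin shrinks as the subset grows, so features that add no
    accuracy are penalised.
    """
    signal = {0: 0.5, 2: 0.4}
    accuracy = sum(signal.get(f, 0.0) for f in subset)
    margin = 1.0 / (1 + len(subset))
    return accuracy, margin


subset, score = hill_climb_select(4, toy_evaluate)
print(sorted(subset))
```

In this toy setting the margin term is what stops the search from adding features 1 and 3, which leave accuracy unchanged. That mirrors the abstract's claim that a mixed criterion helps eliminate redundant features, though the real method's behaviour depends on details not given here.
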
Year | DOI | Venue |
---|---|---|
2006 | 10.1007/11823728_49 | DaWaK |
Keywords | Field | DocType |
hill-climbing search strategy, feature selection, support vector machine, optimal subset, hill-climbing approach, classification accuracy, relevant gene subset selection, small subset, microarray data, relevant gene, accurate classification, data collection, hill climbing | Data warehouse, Data mining, Feature selection, Computer science, Support vector machine, Redundancy (engineering), Microarray analysis techniques, Knowledge extraction, Classifier (linguistics), DNA microarray | Conference |
Volume | ISSN | ISBN |
4081 | 0302-9743 | 3-540-37736-0 |
Citations | PageRank | References |
0 | 0.34 | 11 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xiao Bing Huang | 1 | 0 | 0.68 |
Jian Tang | 2 | 526 | 148.30 |