Title | ||
---|---|---|
Classification Potential vs. Classification Accuracy: A Comprehensive Study of Evolutionary Algorithms with Biomedical Datasets |
Abstract | ||
---|---|---|
Biomedical datasets pose a unique challenge for machine learning and data mining techniques that aim to extract accurate, comprehensible,
and hidden knowledge from them. In this paper, we investigate the influence of a biomedical dataset on the classification accuracy
of an algorithm. To this end, we quantify the complexity of a biomedical dataset in terms of its missing values, imbalance
ratio, noise and information gain. We have performed our experiments using six well-known evolutionary rule learning algorithms
– XCS, UCS, GAssist, cAnt-Miner, SLAVE and Ishibuchi – on 31 publicly available biomedical datasets. The results of our experiments
and statistical analysis show that, among the compared schemes, GAssist gives better classification results on the majority of
biomedical datasets, but it cannot be categorized as the best classifier. Moreover, our analysis reveals that the nature of a biomedical
dataset, not the choice of evolutionary algorithm, plays the major role in determining the classification accuracy achievable on that
dataset. We further show that noise is a dominant factor in determining the complexity of a dataset and that it is inversely
proportional to the classification accuracy of all evaluated algorithms. Finally, we provide researchers with a meta-classification
model that can be used to determine the classification potential of a dataset on the basis of its complexity measures.
|
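
The abstract names four complexity measures (missing values, imbalance ratio, noise, and information gain) but does not give their formulas. The following is a minimal Python sketch of three of them, under stated assumptions: the function names are hypothetical, missing cells are assumed to be encoded as `None`, the imbalance ratio is assumed to be the simple majority-to-minority class count ratio, and information gain is the standard entropy-based definition for a categorical attribute. The paper's own definitions may differ, and the noise measure is not attempted here.

```python
import math
from collections import Counter


def missing_fraction(rows):
    """Fraction of attribute cells that are missing (assumed encoded as None)."""
    cells = [value for row in rows for value in row]
    return sum(value is None for value in cells) / len(cells)


def imbalance_ratio(labels):
    """Majority class count divided by minority class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Class entropy minus the entropy remaining after splitting on one
    categorical attribute. A None value is treated as its own category."""
    n = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional
```

For instance, `imbalance_ratio(["sick", "healthy", "healthy", "healthy"])` compares three majority instances against one minority instance, and `information_gain` reaches the full class entropy when an attribute separates the classes perfectly.
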
Year | DOI | Venue |
---|---|---|
2009 | 10.1007/978-3-642-17508-4_9 | IWLCS |
Keywords | Field | DocType |
performance measures,evolutionary rule learning algorithms,classification,biomedical datasets,machine learning,missing values,data mining,statistical analysis,evolutionary algorithm,information gain | Data mining,Evolutionary algorithm,Computer science,Information gain,Artificial intelligence,Missing data,Classifier (linguistics),Machine learning,Statistical analysis | Conference |
Citations | PageRank | References |
4 | 0.37 | 25 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ajay Kumar Tanwani | 1 | 66 | 9.07 |
Muddassar Farooq | 2 | 1221 | 83.47 |