The Role of Biomedical Dataset in Classification - Citegraph

Paper Info

Title
The Role of Biomedical Dataset in Classification

Abstract
In this paper, we investigate the role of a biomedical dataset on the classification accuracy of an algorithm. We quantify the complexity of a biomedical dataset using five complexity measures: correlation-based feature selection subset merit, noise, imbalance ratio, missing values and information gain. The effect of these complexity measures on classification accuracy is evaluated using five diverse machine learning algorithms: J48 (decision tree), SMO (support vector machines), Naive Bayes (probabilistic), IBk (instance based learner) and JRIP (rule-based induction). The results of our experiments show that noise and correlation-based feature selection subset merit --- not a particular choice of algorithm --- play a major role in determining the classification accuracy. In the end, we provide researchers with a meta-model and an empirical equation to estimate the classification potential of a dataset on the basis of its complexity. This will help researchers to efficiently pre-process the dataset for automatic knowledge extraction.
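
Three of the complexity measures named in the abstract (imbalance ratio, missing values and information gain) can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulas, so the definitions below are common textbook ones and should be read as assumptions: imbalance ratio as majority-class count over minority-class count, missing values as the fraction of empty cells, and information gain as the entropy reduction for a nominal feature. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def missing_fraction(rows):
    """Fraction of cells equal to None across all rows (assumed definition)."""
    total = sum(len(row) for row in rows)
    missing = sum(1 for row in rows for value in row if value is None)
    return missing / total if total else 0.0


def entropy(labels):
    """Shannon entropy of the class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their entropy conditioned on a nominal feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy dataset: four instances, one nominal feature, two numeric columns.
toy_labels = ["pos", "pos", "pos", "neg"]
toy_feature = ["a", "a", "b", "b"]
toy_rows = [[1, None], [2, 3], [None, 4], [5, 6]]

ir = imbalance_ratio(toy_labels)
mf = missing_fraction(toy_rows)
ig = information_gain(toy_feature, toy_labels)
```

On this toy data the class distribution is three to one, two of the eight cells are empty, and the feature separates the classes only partially, so the information gain is positive but well below the label entropy.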

Year	DOI	Venue
2009	10.1007/978-3-642-02976-9_51	AIME '87
Keywords	Field	DocType
automatic knowledge extraction,biomedical dataset,complexity measure,major role,diverse machine,correlation-based feature selection subset,classification accuracy,classification potential,decision tree,naive bayes,machine learning,rule based,support vector machine,feature selection,meta model,information gain,missing values,knowledge extraction	Decision tree,Data mining,Naive Bayes classifier,Feature selection,Computer science,Support vector machine,C4.5 algorithm,Artificial intelligence,Knowledge extraction,Missing data,Probabilistic logic,Machine learning	Conference
Volume	ISSN	Citations
5651	0302-9743	11
PageRank	References	Authors
0.57	3	2

Authors (2 rows)

Name	Order	Citations	PageRank
Ajay Kumar Tanwani	1	66	9.07
Muddassar Farooq	2	1221	83.47
