Title
Guidelines to Select Machine Learning Scheme for Classification of Biomedical Datasets
Abstract
Biomedical datasets pose a unique challenge to machine learning and data mining algorithms for classification because of their high dimensionality, multiple classes, noisy data and missing values. This paper provides a comprehensive evaluation of a diverse set of machine learning schemes on a number of biomedical datasets. To this end, we follow a four-step evaluation methodology: (1) pre-processing the datasets to remove any redundancy; (2) classifying the datasets with six different machine learning algorithms: Naive Bayes (probabilistic), multi-layer perceptron (neural network), SMO (support vector machine), IBk (instance-based learner), J48 (decision tree) and RIPPER (rule-based induction); (3) bagging and boosting each algorithm; and (4) combining the best version of each base classifier into a team of classifiers using stacking and voting techniques. Using this methodology, we have performed experiments on 31 different biomedical datasets. To the best of our knowledge, this is the first study in which such a diverse set of machine learning algorithms is evaluated on so many biomedical datasets. The main outcome of our extensive study is a set of promising guidelines that will help researchers choose the best classification scheme for a biomedical dataset of a given nature.
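The resampling behind bagging in step (3) and the voting combination in step (4) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the class labels and the per-classifier predictions below are all hypothetical, and the base classifiers themselves are not trained here.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) items from rows with replacement.

    This is the resampling step of bagging: each bagged copy of a base
    classifier would be trained on one such sample.
    """
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions_per_classifier):
    """Combine label predictions by plurality vote.

    predictions_per_classifier holds one list of labels per classifier,
    all of the same length. On a tie, the tied label that appears first
    in classifier order wins.
    """
    combined = []
    for votes in zip(*predictions_per_classifier):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions of three base classifiers on four instances.
naive_bayes = ["sick", "healthy", "sick", "healthy"]
decision_tree = ["sick", "sick", "sick", "healthy"]
nearest_neighbour = ["healthy", "healthy", "sick", "sick"]

team_prediction = majority_vote([naive_bayes, decision_tree, nearest_neighbour])

# One bootstrap sample of hypothetical instance indices, seeded for repeatability.
sample = bootstrap_sample(list(range(10)), random.Random(0))
```

Stacking differs from voting in that a further classifier is trained on the base classifiers' outputs instead of counting them; that step is not shown in this sketch.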
Year
2009
DOI
10.1007/978-3-642-01184-9_12
Venue
EvoBIO
Keywords
support vector machine, biomedical dataset, select machine learning scheme, diverse machine, biomedical datasets, comprehensive evaluation, different biomedical datasets, best version, best classification scheme, different machine, diverse set, classification, missing values, rule based, decision tree, naive bayes, multi layer perceptron, machine learning, neural network
Field
Decision tree, Data mining, Online machine learning, Naive Bayes classifier, Computer science, Support vector machine, Boosting (machine learning), Artificial intelligence, Relevance vector machine, Artificial neural network, Perceptron, Machine learning
DocType
Conference
Volume
5483
ISSN
0302-9743
Citations
18
PageRank
0.95
References
16
Authors
4
Name                 Order  Citations  PageRank
Ajay Kumar Tanwani   1      66         9.07
Jamal Afridi         2      18         0.95
M. Zubair Shafiq     3      546        43.41
Muddassar Farooq     4      1221       83.47