Title
Comparison of reuse strategies for case-based classification in bioinformatics
Abstract
Bioinformatics offers an interesting challenge for data mining algorithms given the high dimensionality of its data and the comparatively small set of samples. Case-based classification algorithms have been successfully applied to classify bioinformatics data and often serve as a reference for other algorithms. This paper therefore studies, on some of the most benchmarked datasets in bioinformatics, the performance of different reuse strategies in case-based classification in order to make methodological recommendations for applying these algorithms to this domain. The results show that k-nearest-neighbor (kNN) classifiers coupled with between-group to within-group sum of squares (BSS/WSS) feature selection can perform as well as, and even better than, the best benchmarked algorithms to date. However, the reuse strategy chosen played a major role in optimizing the algorithms. In particular, optimizing both the number k of neighbors and the number of features taken into account was key to improving classification accuracy.
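The pipeline the abstract describes (BSS/WSS feature ranking, kNN classification, and joint tuning of k and the number of features) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the reuse strategies, so the sketch assumes a plain majority vote as the reuse step, Euclidean distance for retrieval, and leave-one-out cross-validation for tuning. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import dist

def bss_wss_scores(X, y):
    """Ratio of between-group to within-group sum of squares, per feature."""
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        bss = wss = 0.0
        for c in set(y):
            values = [v for v, label in zip(column, y) if label == c]
            class_mean = sum(values) / len(values)
            bss += len(values) * (class_mean - overall_mean) ** 2
            wss += sum((v - class_mean) ** 2 for v in values)
        if wss > 0:
            scores.append(bss / wss)
        else:
            # No within-group spread: perfectly separating or constant feature.
            scores.append(float("inf") if bss > 0 else 0.0)
    return scores

def top_features(X, y, n):
    """Indices of the n features with the highest BSS/WSS ratio."""
    scores = bss_wss_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:n]

def knn_predict(X_train, y_train, query, k, features):
    """Assumed reuse step: majority vote among the k nearest cases."""
    def project(row):
        return [row[j] for j in features]
    q = project(query)
    nearest = sorted(zip(X_train, y_train),
                     key=lambda case: dist(project(case[0]), q))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(X, y, k, n_features):
    """Leave-one-out accuracy; features are re-selected inside each fold."""
    correct = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        features = top_features(X_train, y_train, n_features)
        correct += knn_predict(X_train, y_train, X[i], k, features) == y[i]
    return correct / len(X)

def grid_search(X, y, ks, feature_counts):
    """Pick the (k, number of features) pair with the best LOO accuracy."""
    grid = [(k, n) for k in ks for n in feature_counts]
    return max(grid, key=lambda p: loo_accuracy(X, y, p[0], p[1]))

# Hypothetical toy data: feature 0 separates the classes, the rest is noise.
X = [[1.0, 5.0, 0.3], [1.2, 1.0, 0.9], [0.9, 3.0, 0.1],
     [3.0, 4.0, 0.8], [3.1, 2.0, 0.2], [2.9, 6.0, 0.5]]
y = ["A", "A", "A", "B", "B", "B"]

if __name__ == "__main__":
    best_k, best_n = grid_search(X, y, ks=[1, 3], feature_counts=[1, 2])
    print("best k:", best_k, "best number of features:", best_n)
```

Re-selecting features inside each leave-one-out fold keeps the held-out case from influencing the ranking, which matters when samples are few and features are many. Other reuse strategies, such as distance-weighted voting, would replace only the vote in `knn_predict`.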
Year: 2011
DOI: 10.1007/978-3-642-23291-6_29
Venue: ICCBR
Keywords: data mining, benchmarked datasets, case-based classification algorithm, benchmarked algorithm, classification accuracy, case-based classification, number k, different reuse strategy, reuse strategy, bioinformatics data, k nearest neighbor, classification, bioinformatics, feature selection, reuse
Field: Data mining, Feature selection, Computer science, Artificial intelligence, Data mining algorithm, Small set, k-nearest neighbors algorithm, Reuse, Curse of dimensionality, Bioinformatics, Statistical classification, Explained sum of squares, Machine learning
DocType: Conference
Volume: 6880
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 13
Authors: 1
Name
Order
Citations
PageRank
Isabelle Bichindaritz153255.74