Title
A double pruning algorithm for classification ensembles
Abstract
This article introduces a double pruning algorithm that can be used to reduce the storage requirements, speed up the classification process, and improve the performance of parallel ensembles. A key element in the design of the algorithm is the estimation of the class label that the ensemble assigns to a given test instance by polling only a fraction of its classifiers. Instead of applying this form of dynamic (instance-based) pruning to the original ensemble, we propose to apply it to a subset of classifiers selected using standard ensemble pruning techniques. The pruned subensemble is built by first modifying the order in which classifiers are aggregated in the ensemble and then selecting the first classifiers in the ordered sequence. Experiments on benchmark problems illustrate the improvements that can be obtained with this technique. Specifically, using a bagging ensemble of 101 CART trees as a starting point, only the 21 trees of the pruned ordered ensemble need to be stored in memory. Depending on the classification task, only 5 to 12 of these 21 classifiers are queried, on average, to compute the predictions. The generalization performance achieved by this double pruning algorithm is similar to that of pruned ordered bagging and significantly better than that of standard bagging.
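The two pruning stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several things here are assumptions made for illustration: classifiers are modeled as plain callables, the ordering criterion is a greedy reduction of the error on a selection set, and the names `order_and_prune` and `predict_with_early_stop` are hypothetical. The stopping rule is a simple deterministic stand-in (stop once the leading class can no longer be overtaken), not the estimation procedure used in the article.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

# Assumption: a classifier is any callable mapping an instance to a class label.
Classifier = Callable[[Sequence[float]], Hashable]


def majority_vote(votes: Sequence[Hashable]) -> Hashable:
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(votes).most_common(1)[0][0]


def order_and_prune(classifiers: List[Classifier],
                    X_sel: Sequence[Sequence[float]],
                    y_sel: Sequence[Hashable],
                    keep: int) -> List[Classifier]:
    """Stage 1 (static pruning): greedily reorder the ensemble so that each
    added classifier minimises the subensemble error on a selection set,
    then keep only the first `keep` classifiers of that ordering."""
    # Cache predictions: preds[i][j] is classifier i's label for instance j.
    preds = [[clf(x) for x in X_sel] for clf in classifiers]
    remaining = list(range(len(classifiers)))
    chosen: List[int] = []
    while remaining and len(chosen) < keep:
        def error_with(i: int) -> int:
            members = chosen + [i]
            return sum(
                majority_vote([preds[m][j] for m in members]) != y
                for j, y in enumerate(y_sel)
            )
        best = min(remaining, key=error_with)
        chosen.append(best)
        remaining.remove(best)
    return [classifiers[i] for i in chosen]


def predict_with_early_stop(ensemble: List[Classifier],
                            x: Sequence[float]) -> Tuple[Hashable, int]:
    """Stage 2 (instance-based pruning): poll classifiers in order and stop
    as soon as the remaining votes cannot change the winning class.
    Returns the predicted label and the number of classifiers queried."""
    counts: Counter = Counter()
    total = len(ensemble)
    for polled, clf in enumerate(ensemble, start=1):
        counts[clf(x)] += 1
        ranked = counts.most_common(2)
        lead = ranked[0][1]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if lead - runner_up > total - polled:
            return ranked[0][0], polled
    return counts.most_common(1)[0][0], total
```

In this sketch only the classifiers returned by `order_and_prune` would need to be stored, and `predict_with_early_stop` reports how many of them were actually queried for a given instance. Because the stand-in stopping rule is exact, it always returns the same label as a full vote of the pruned subensemble; a probabilistic rule could stop earlier at the cost of occasional disagreement.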
Year
2010
DOI
10.1007/978-3-642-12127-2_11
Venue
MCS
Keywords
standard ensemble pruning technique, double pruning algorithm, classification task, standard bagging, cart tree, parallel ensemble, classification ensemble, generalization performance, classification process, bagging ensemble, original ensemble, ensemble learning, decision trees, decision tree
Field
Pruning algorithm, Decision tree, Pattern recognition, Random subspace method, Computer science, Principal variation search, Polling, Pruning (decision trees), Artificial intelligence, Ensemble learning, Machine learning, Pruning
DocType
Conference
Volume
5997
ISSN
0302-9743
ISBN
3-642-12126-8
Citations
6
PageRank
0.45
References
15
Authors
4
Name | Order | Citations | PageRank
Víctor Soto | 1 | 33 | 2.60
Gonzalo Martínez-Muñoz | 2 | 524 | 23.76
Daniel Hernández-Lobato | 3 | 440 | 26.10
Alberto Suárez | 4 | 487 | 22.33