Abstract | ||
---|---|---|
This article introduces a double pruning algorithm that can be used to reduce the storage requirements, speed up the classification process and improve the performance of parallel ensembles. A key element in the design of the algorithm is the estimation of the class label that the ensemble assigns to a given test instance by polling only a fraction of its classifiers. Instead of applying this form of dynamical (instance-based) pruning to the original ensemble, we propose to apply it to a subset of classifiers selected using standard ensemble pruning techniques. The pruned subensemble is built by first modifying the order in which classifiers are aggregated in the ensemble and then selecting the first classifiers in the ordered sequence. Experiments on benchmark problems illustrate the improvements that can be obtained with this technique. Specifically, using a bagging ensemble of 101 CART trees as a starting point, only the 21 trees of the pruned ordered ensemble need to be stored in memory. Depending on the classification task, on average, only 5 to 12 of these 21 classifiers are queried to compute the predictions. The generalization performance achieved by this double pruning algorithm is similar to that of pruned ordered bagging and significantly better than that of standard bagging. |
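
The two pruning stages described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `greedy_order` and `dynamic_vote` are hypothetical, the ordering heuristic is a simple greedy reduce-error rule chosen for brevity, and the stopping rule is a deterministic one (stop once the remaining votes cannot change the majority) swapped in for whatever estimate the paper actually uses. Because this deterministic rule never stops before the leading class holds more than half of the votes, it would not reproduce the 5 to 12 queries reported above.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

Label = Hashable
Classifier = Callable[[object], Label]


def greedy_order(preds: Sequence[Sequence[Label]], y: Sequence[Label]) -> List[int]:
    """Static pruning step (assumed heuristic): reorder classifiers greedily.

    preds[c][i] is the label classifier c assigns to selection instance i.
    At each step, append the classifier that minimizes the majority-vote
    error of the growing subensemble on the selection set.
    """
    remaining = list(range(len(preds)))
    order: List[int] = []
    counts = [Counter() for _ in y]  # running vote counts per instance

    def error_if_added(c: int) -> int:
        errors = 0
        for i, label in enumerate(y):
            trial = counts[i].copy()
            trial[preds[c][i]] += 1
            # Ties are resolved in favor of the class that was voted first.
            if trial.most_common(1)[0][0] != label:
                errors += 1
        return errors

    while remaining:
        best = min(remaining, key=error_if_added)
        remaining.remove(best)
        order.append(best)
        for i in range(len(y)):
            counts[i][preds[best][i]] += 1
    return order


def dynamic_vote(classifiers: Sequence[Classifier], x: object) -> Tuple[Label, int]:
    """Dynamic (instance-based) pruning step with a deterministic stop rule.

    Polls classifiers in order and stops as soon as the leading class can
    no longer be overtaken by the votes that have not been cast yet.
    Returns the predicted label and the number of classifiers queried.
    """
    total = len(classifiers)
    counts: Counter = Counter()
    for queried, clf in enumerate(classifiers, start=1):
        counts[clf(x)] += 1
        ranked = counts.most_common(2)
        lead = ranked[0][1]
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if lead - runner_up > total - queried:
            return ranked[0][0], queried
    # Only reached when the full vote ends in a tie (or the list is empty
    # is not handled here); fall back to the first class with most votes.
    return counts.most_common(1)[0][0], total
```

To combine the two stages, one would keep only the first classifiers returned by `greedy_order` (for example, 21 out of 101) and pass that shortened list to `dynamic_vote` for each test instance.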
Year | DOI | Venue |
---|---|---|
2010 | 10.1007/978-3-642-12127-2_11 | MCS |
Keywords | Field | DocType |
standard ensemble pruning technique,double pruning algorithm,classification task,standard bagging,cart tree,parallel ensemble,classification ensemble,generalization performance,classification process,bagging ensemble,original ensemble,ensemble learning,decision trees,decision tree | Pruning algorithm,Decision tree,Pattern recognition,Random subspace method,Computer science,Principal variation search,Polling,Pruning (decision trees),Artificial intelligence,Ensemble learning,Machine learning,Pruning | Conference |
Volume | ISSN | ISBN |
5997 | 0302-9743 | 3-642-12126-8 |
Citations | PageRank | References |
6 | 0.45 | 15 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Víctor Soto | 1 | 33 | 2.60 |
Gonzalo Martínez-Muñoz | 2 | 524 | 23.76 |
Daniel Hernández-Lobato | 3 | 440 | 26.10 |
Alberto Suárez | 4 | 487 | 22.33 |