Abstract |
---|
Bagging (Breiman, 1994a) is a technique that tries to improve a learning algorithm's performance by using bootstrap replicates of the training set (Efron & Tibshirani, 1993; Efron, 1979). The computational requirements for estimating the resultant generalization error on a test set by means of cross-validation are often prohibitive; for leave-one-out cross-validation one needs to train the underlying algorithm on the order of mν times, where m is the size of the training set and ν is the number of replicates. This paper presents several techniques for estimating the generalization error of a bagged learning algorithm without invoking yet more training of the underlying learning algorithm (beyond that of the bagging itself), as is required by cross-validation-based estimation. These techniques all exploit the bias-variance decomposition (Geman, Bienenstock & Doursat, 1992; Wolpert, 1996). The best of our estimators also exploits stacking (Wolpert, 1992). In a set of experiments reported here, it was found to be more accurate than both the alternative cross-validation-based estimator of the bagged algorithm's error and the cross-validation-based estimator of the underlying algorithm's error. This improvement was particularly pronounced for small test sets. This suggests a novel justification for using bagging: more accurate estimation of the generalization error than is possible without bagging. |
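The abstract's central point is that one can estimate a bagged learner's generalization error using only the ν models already trained during bagging, rather than the ~mν extra training runs that leave-one-out cross-validation would require. The sketch below illustrates that general idea with Breiman's out-of-bag estimate (not the paper's bias-variance/stacking estimators): each training point is scored only by the replicates whose bootstrap sample omitted it. The toy data, the `train_stump` learner, and all variable names are illustrative assumptions, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D classification data: label is the sign of x plus noise.
m = 200                                     # training-set size (the paper's m)
X = rng.normal(size=m)
y = (X + 0.5 * rng.normal(size=m) > 0).astype(int)

nu = 25                                     # number of bootstrap replicates (the paper's nu)

def train_stump(Xb, yb):
    # A trivial stand-in "learning algorithm": threshold at the midpoint
    # between the two class means of the bootstrap sample.
    return 0.5 * (Xb[yb == 1].mean() + Xb[yb == 0].mean())

thresholds, oob_masks = [], []
for _ in range(nu):
    idx = rng.integers(0, m, size=m)        # one bootstrap replicate
    oob = np.ones(m, dtype=bool)
    oob[idx] = False                        # points never drawn are "out of bag"
    thresholds.append(train_stump(X[idx], y[idx]))
    oob_masks.append(oob)

# Votes of every replicate on every point; the bagged prediction is the majority.
votes = np.stack([(X > t).astype(int) for t in thresholds])
bagged_pred = (votes.mean(axis=0) > 0.5).astype(int)

# Out-of-bag error estimate: each point is predicted only by replicates
# that did NOT see it, so no training beyond the bagging itself is needed.
oob_pred = np.full(m, -1)
for i in range(m):
    ks = [k for k in range(nu) if oob_masks[k][i]]
    if ks:
        oob_pred[i] = int(np.mean([votes[k, i] for k in ks]) > 0.5)

covered = oob_pred >= 0
oob_error = float(np.mean(oob_pred[covered] != y[covered]))
print(f"out-of-bag error estimate: {oob_error:.3f}")
```

With ν = 25 replicates, each point is out-of-bag for roughly 37% of them (the classic (1 − 1/m)^m ≈ e⁻¹ fraction), so essentially every training point gets a held-out-style prediction for free.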
Year | DOI | Venue |
---|---|---|
1999 | 10.1023/A:1007519102914 | Machine Learning |
Keywords | Field | DocType
---|---|---|
Bagging, cross-validation, stacking, generalization error, bootstrap | Training set, Pattern recognition, Computer science, Generalization error, Artificial intelligence, Cross-validation, Bootstrapping (electronics), Machine learning, Estimator, Test set | Journal
Volume | Issue | ISSN
---|---|---|
35 | 1 | 1573-0565
Citations | PageRank | References
---|---|---|
17 | 19.67 | 4
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
David H. Wolpert | 1 | 4334 | 591.07 |
William G. Macready | 2 | 161 | 39.07 |