Title
An Efficient Method To Estimate Bagging's Generalization Error
Abstract
Bagging (Breiman, 1994a) is a technique that tries to improve a learning algorithm's performance by using bootstrap replicates of the training set (Efron & Tibshirani, 1993; Efron, 1979). The computational requirements for estimating the resultant generalization error on a test set by means of cross-validation are often prohibitive: for leave-one-out cross-validation one needs to train the underlying algorithm on the order of mν times, where m is the size of the training set and ν is the number of replicates. This paper presents several techniques for estimating the generalization error of a bagged learning algorithm without invoking yet more training of the underlying learning algorithm (beyond that of the bagging itself), as is required by cross-validation-based estimation. These techniques all exploit the bias-variance decomposition (Geman, Bienenstock & Doursat, 1992; Wolpert, 1996). The best of our estimators also exploits stacking (Wolpert, 1992). In a set of experiments reported here, it was found to be more accurate than both the alternative cross-validation-based estimator of the bagged algorithm's error and the cross-validation-based estimator of the underlying algorithm's error. This improvement was particularly pronounced for small test sets. This suggests a novel justification for using bagging: more accurate estimation of the generalization error than is possible without bagging.
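To make the computational setting of the abstract concrete, here is a minimal Python sketch of plain bagging with an out-of-bag-style error estimate (in the spirit of Breiman's out-of-bag idea, not this paper's bias-variance/stacking estimators). It reuses the ν replicates that bagging already trains, so it adds no further fits; the names fit_ls and bagged_oob_mse are illustrative, not from the paper.

import numpy as np

rng = np.random.default_rng(0)

def fit_ls(X, y):
    # Ordinary least squares with an intercept, standing in for the
    # underlying learning algorithm.
    A = np.c_[np.ones(len(X)), X]
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict(w, X):
    return np.c_[np.ones(len(X)), X] @ w

def bagged_oob_mse(X, y, nu=50):
    # Train nu bootstrap replicates (the bagging itself) and score each
    # training point only with replicates whose bootstrap sample missed it,
    # so no training occurs beyond the nu fits bagging already requires.
    m = len(X)
    sq_err = np.zeros(m)
    counts = np.zeros(m)
    for _ in range(nu):
        idx = rng.integers(0, m, size=m)       # one bootstrap replicate
        oob = np.setdiff1d(np.arange(m), idx)  # points the replicate never saw
        w = fit_ls(X[idx], y[idx])
        sq_err[oob] += (predict(w, X[oob]) - y[oob]) ** 2
        counts[oob] += 1
    seen = counts > 0
    return float(np.mean(sq_err[seen] / counts[seen]))

# Toy data: noisy linear target.
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(200)
print("OOB-style MSE estimate:", bagged_oob_mse(X, y))

By contrast, leave-one-out cross-validation of the same bagged predictor would require on the order of m·ν additional fits, which is the cost the paper's techniques avoid.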
Year
1999
DOI
10.1023/A:1007519102914
Venue
Machine Learning
Keywords
Bagging, cross-validation, stacking, generalization error, bootstrap
Field
Training set, Pattern recognition, Computer science, Generalization error, Artificial intelligence, Cross-validation, Bootstrapping (statistics), Machine learning, Estimator, Test set
DocType
Journal
Volume
35
Issue
1
ISSN
1573-0565
Citations
17
PageRank
19.67
References
4
Authors
2
Name                 Order  Citations  PageRank
David H. Wolpert     1      4334       591.07
William G. Macready  2      1613       9.07