Title
A Comparison of Decision Tree Ensemble Creation Techniques
Abstract
We experimentally evaluate bagging and seven other randomization-based approaches to creating an ensemble of decision tree classifiers. Statistical tests were performed on experimental results from 57 publicly available data sets. When cross-validation comparisons were tested for statistical significance, the best method was statistically more accurate than bagging on only eight of the 57 data sets. Alternatively, examining the average ranks of the algorithms across the group of data sets, we find that boosting, random forests, and randomized trees are statistically significantly better than bagging. Because our results suggest that using an appropriate ensemble size is important, we introduce an algorithm that decides when a sufficient number of classifiers has been created for an ensemble. Our algorithm uses the out-of-bag error estimate, and is shown to result in an accurate ensemble for those methods that incorporate bagging into the construction of the ensemble.
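The abstract's stopping rule (grow the ensemble until the out-of-bag error estimate stops improving) can be sketched as below. This is a minimal illustration, not the authors' exact procedure: the base-learner interface (`build_tree` returning a callable classifier), the smoothing `window`, and the tolerance `tol` are all assumptions made for the example.

```python
import numpy as np

def oob_stopping_bagging(X, y, build_tree, max_trees=200, window=10, tol=1e-3, seed=0):
    """Grow a bagged ensemble of binary classifiers, stopping when the
    running out-of-bag (OOB) error estimate plateaus (hedged sketch)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    votes = np.zeros((n, 2))          # OOB vote counts per sample (binary labels 0/1)
    trees, errors = [], []
    for t in range(max_trees):
        idx = rng.integers(0, n, n)                 # bootstrap sample (with replacement)
        oob = np.setdiff1d(np.arange(n), idx)       # rows left out of this bootstrap
        tree = build_tree(X[idx], y[idx])
        trees.append(tree)
        votes[oob, tree(X[oob])] += 1               # each tree votes only on its OOB rows
        covered = votes.sum(axis=1) > 0             # rows that have received any OOB vote
        pred = votes[covered].argmax(axis=1)
        errors.append(np.mean(pred != y[covered]))  # current OOB error estimate
        # Stop once the smoothed OOB error has stopped improving over `window` trees.
        if len(errors) >= 2 * window:
            recent = np.mean(errors[-window:])
            prev = np.mean(errors[-2 * window:-window])
            if prev - recent < tol:
                break
    return trees, errors

def stump(Xb, yb):
    """Toy base learner: best threshold split on feature 0 (illustrative only)."""
    best = (np.inf, 0.0, 0, 1)
    for thr in np.unique(Xb[:, 0]):
        for lo, hi in ((0, 1), (1, 0)):
            err = np.mean(np.where(Xb[:, 0] <= thr, lo, hi) != yb)
            if err < best[0]:
                best = (err, thr, lo, hi)
    _, thr, lo, hi = best
    return lambda X: np.where(X[:, 0] <= thr, lo, hi)
```

The key design point is that OOB voting gives a validation-style error estimate at no extra data cost, which is why the authors restrict the rule to methods that incorporate bagging.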
Year
2007
DOI
10.1109/TPAMI.2007.2
Venue
IEEE Trans. Pattern Anal. Mach. Intell.
Keywords
classifier ensembles, bagging, boosting, random forests, random subspaces, performance evaluation
Field
Decision tree, Pattern recognition, Computer science, Supervised learning, Artificial intelligence, Boosting (machine learning), Random forest, Cross-validation, Ensemble learning, Decision tree learning, Statistical hypothesis testing, Machine learning
DocType
Journal
Volume
29
Issue
1
ISSN
0162-8828
Citations
138
PageRank
5.12
References
14
Authors
4
Name                   Order  Citations  PageRank
Robert E. Banfield     1      358        17.16
Lawrence O. Hall       2      5543       335.87
Kevin W. Bowyer        3      11121      734.33
W. Philip Kegelmeyer   4      3498       146.54