Title: A Comparison of Ensemble Creation Techniques
Abstract: We experimentally evaluated bagging and six other randomization-based methods for creating ensembles of decision trees. Bagging uses randomization to create multiple training sets. Other approaches, such as Randomized C4.5, apply randomization when selecting the test at a given node of a tree. Approaches such as random forests and random subspaces instead apply randomization to the selection of the attributes used in building the tree. In contrast, boosting incrementally builds classifiers by focusing on the examples misclassified by the existing ensemble. Experiments were performed on 34 publicly available data sets. While each of the other six approaches has some strengths, we find that none of them is consistently more accurate than standard bagging when tested for statistical significance.
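The bagging procedure the abstract describes can be sketched briefly: draw bootstrap replicates of the training set, build one classifier per replicate, and combine predictions by majority vote. This is a minimal illustrative sketch, not the paper's experimental code; the toy 1-nearest-neighbour base learner and the tiny 1-D data set are assumptions standing in for the decision trees used in the paper.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    # Bagging's randomization step: sample len(data) points with replacement.
    return [rng.choice(data) for _ in data]

def one_nn_predict(train, x):
    # Toy base learner: 1-nearest neighbour on 1-D inputs (a stand-in for a tree).
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def bagged_predict(replicates, x):
    # Majority vote over the classifiers built on each bootstrap replicate.
    votes = Counter(one_nn_predict(train, x) for train in replicates)
    return votes.most_common(1)[0][0]

rng = random.Random(0)  # fixed seed for reproducibility
data = [(0.1, "a"), (0.2, "a"), (0.9, "b"), (1.1, "b")]
replicates = [bootstrap_sample(data, rng) for _ in range(25)]
print(bagged_predict(replicates, 1.0))  # prints "b"
```

The other families of methods differ only in where the randomization is injected: Randomized C4.5 would randomize the choice of test inside `one_nn_predict`'s stand-in, while random subspaces and random forests would randomize which attributes each base learner sees.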
Year: 2004
DOI: 10.1007/978-3-540-25966-4_22
Venue: Lecture Notes in Computer Science
Keywords: decision tree, random forest
Field: Random tree, Decision tree, Data set, Computer science, Randomization, Artificial intelligence, Boosting (machine learning), Random forest, Machine learning, Bootstrapping (electronics), Statistical hypothesis testing
DocType: Conference
Volume: 3077
ISSN: 0302-9743
Citations: 24
PageRank: 2.06
References: 13
Authors: 6
Name                   Order  Citations  PageRank
Robert E. Banfield     1      358        17.16
Lawrence O. Hall       2      5543       335.87
Kevin W. Bowyer        3      11121      734.33
Divya Bhadoria         4      44         3.83
W. Philip Kegelmeyer   5      3498       146.54
Steven Eschrich        6      89         10.81