Title
Error-based pruning of decision trees grown on very large data sets can work!
Abstract
It has been asserted that, using traditional pruning methods, growing decision trees with increasingly larger amounts of training data will result in larger tree sizes even when accuracy does not increase. With regard to error-based pruning, the experimental data used to illustrate this assertion have apparently been obtained using the default setting for pruning strength; in particular, using the default certainty factor of 25 in the C4.5 decision tree implementation. We show that, in general, an appropriate setting of the certainty factor for error-based pruning will cause decision tree size to plateau when accuracy is not increasing with more training data.
Year
2002
DOI
10.1109/TAI.2002.1180809
Venue
Tools with Artificial Intelligence, 2002.
Keywords
decision trees, C4.5 decision tree implementation, error-based decision tree pruning, very large data sets
Field
Decision tree, Grafting (decision trees), Pattern recognition, Computer science, Principal variation search, Artificial intelligence, Pruning (decision trees), Machine learning, Alternating decision tree, Decision tree learning, Decision stump, Incremental decision tree
DocType
Conference
ISSN
1082-3409
ISBN
0-7695-1849-4
Citations
3
PageRank
0.51
References
6
Authors
4
Name	Order	Citations	PageRank
Lawrence O. Hall	1	5543	335.87
Richard Collins	2	3	0.51
Kevin W. Bowyer	3	11121	734.33
Robert E. Banfield	4	358	17.16