Title
Is Error-Based Pruning Redeemable?
Abstract
Error-based pruning can be used to prune a decision tree and does not require validation data. It is implemented in the widely used C4.5 decision tree software. It uses a parameter, the certainty factor, that affects the size of the pruned tree. Several researchers have compared error-based pruning with other approaches and reported results suggesting that it produces larger trees with no increase in accuracy. They further suggest that as more data is added to the training set, the tree size after error-based pruning continues to grow even though accuracy does not improve. It appears that these results were obtained with the default certainty factor value. Here, we show that varying the certainty factor allows significantly smaller trees to be obtained with minimal or no accuracy loss. Also, the growth of tree size with added data can be halted with an appropriate choice of certainty factor. Methods of determining the certainty factor are discussed for both small and large data sets. Experimental results support the conclusion that error-based pruning can produce appropriately sized trees with good accuracy when compared with reduced error pruning.
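For context on the certainty factor discussed in the abstract: C4.5's error-based pruning charges each leaf a pessimistic error rate, taken as the upper confidence limit of a binomial proportion at confidence level CF (default 0.25); a subtree is replaced by a leaf when the leaf's pessimistic error count is no worse than the subtree's. The sketch below is a minimal, stdlib-only Python illustration of that upper-limit computation; the function names are ours, not C4.5's.

```python
import math

def binom_cdf(e, n, p):
    """P(X <= e) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(e + 1))

def pessimistic_error_rate(n, e, cf=0.25):
    """Upper confidence limit on the true error rate of a leaf that
    misclassifies e of its n training examples: the largest rate p at
    which observing <= e errors still has probability cf.
    Found by bisection on the (decreasing) binomial CDF."""
    if e >= n:
        return 1.0
    lo, hi = e / n, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(e, n, mid) > cf:
            lo = mid  # CDF still above cf: true upper limit lies further right
        else:
            hi = mid
    return (lo + hi) / 2

# Quinlan's textbook example: a leaf covering 6 examples with no errors
# is still charged a pessimistic error rate of about 0.206 at CF = 0.25.
print(round(pessimistic_error_rate(6, 0), 3))         # 0.206
# A smaller certainty factor is more pessimistic, hence prunes more:
print(pessimistic_error_rate(6, 0, cf=0.10) > 0.206)  # True
```

This also illustrates the paper's central knob: lowering CF inflates the estimated leaf errors, so more subtrees are collapsed and smaller trees result, while raising CF toward 0.5 approaches the raw resubstitution error and prunes little.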
Year
2003
DOI
10.1142/S0218213003001228
Venue
International Journal on Artificial Intelligence Tools
Keywords
pruning, error based pruning, decision tree, reduced error pruning
Field
Decision tree, Data mining, Data set, Computer science, Software, Artificial intelligence, Pruning, Training set, Pattern recognition, Pruning (decision trees), Machine learning
DocType
Journal
Volume
12
Issue
3
Citations
1
PageRank
0.38
References
6
Authors
5
Name, Order, Citations, PageRank
Lawrence O. Hall, 1, 5543, 335.87
Kevin W. Bowyer, 2, 11121, 734.33
Robert E. Banfield, 3, 358, 17.16
Steven Eschrich, 4, 89, 10.81
Richard Collins, 5, 10, 2.43