Title
Adapted pruning scheme for the framework of imbalanced data-sets.
Abstract
Learning from imbalanced data is attracting increasing interest from the machine learning community, mainly because of the large number of real-world applications affected by this situation. Adapting standard decision trees to imbalanced data is one of the many approaches that have been developed to address this problem. This adaptation has been proposed from three different perspectives: splitting criterion, assignment rule, and pruning. In this paper, we focus on the pruning of decision trees. We propose an adaptation of the standard pruning algorithm MCCP to address the skewed-data problem. Our contribution operates at two levels: adapting the metric used to select the nodes to be pruned first, and changing the evaluation measure used to select the best decision tree on the pruning set. Our goal is to show that, contrary to the popular belief in the literature that decision-tree pruning is useless in this setting, an adaptive pruning technique for imbalanced situations is more efficient and more accurate on the minority class. A total of twelve binary-class data-sets with different imbalance ratios are used to test the performance of the proposed method. Experimental results show that the proposed post-pruning approach can increase the performance of imbalanced decision trees in terms of recent evaluation measures that are appropriate for imbalanced classification. (C) 2017 The Authors. Published by Elsevier B.V.
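The second level of the contribution, changing the measure used to pick the best pruned tree on the pruning set, can be illustrated with a minimal sketch. It is not the authors' implementation. The candidate subtrees and their confusion counts are hypothetical, and the measure is the commonly cited form of the index of balanced accuracy (IBA appears among the keywords); whether the paper uses exactly this form, or this value of `alpha`, is an assumption.

```python
def rates(tp, fn, tn, fp):
    """Return (TPR, TNR) from confusion counts; the minority class is 'positive'."""
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return tpr, tnr


def accuracy(tp, fn, tn, fp):
    """Plain accuracy, the usual selection measure in standard pruning."""
    return (tp + tn) / (tp + fn + tn + fp)


def iba(tp, fn, tn, fp, alpha=0.1):
    """Index of balanced accuracy: (1 + alpha * (TPR - TNR)) * TPR * TNR.

    alpha=0.1 is an assumed weighting, not a value taken from the paper.
    """
    tpr, tnr = rates(tp, fn, tn, fp)
    return (1 + alpha * (tpr - tnr)) * tpr * tnr


# Hypothetical pruning-set results for three candidate subtrees
# (10 minority examples, 90 majority examples).
candidates = {
    "unpruned": dict(tp=6, fn=4, tn=85, fp=5),
    "pruned_a": dict(tp=8, fn=2, tn=80, fp=10),
    "root_only": dict(tp=0, fn=10, tn=90, fp=0),
}


def select(candidates, measure):
    """Pick the candidate subtree that maximizes the given measure."""
    return max(candidates, key=lambda name: measure(**candidates[name]))


best_by_accuracy = select(candidates, accuracy)
best_by_iba = select(candidates, iba)
print(best_by_accuracy, best_by_iba)
```

With these made-up counts, accuracy favors the tree that misses more minority examples, while the imbalance-aware measure favors the subtree with the better minority recall. The single-leaf tree that predicts only the majority class scores 0.90 accuracy but zero IBA.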
Year
2017
DOI
10.1016/j.procs.2017.08.060
Venue
Procedia Computer Science
Keywords
imbalanced class, decision-tree pruning, skewed data-sets, IBA, B42
Field
Pruning algorithm, Decision tree, Data mining, Data set, Computer science, Principal variation search, Artificial intelligence, Pruning (decision trees), Machine learning, Pruning, Binary number
DocType
Conference
Volume
112
ISSN
1877-0509
Citations
3
PageRank
0.37
References
12
Authors
3
Name                 Order   Citations   PageRank
Ikram Chaabane       1       3           0.37
Radhouane Guermazi   2       23          5.55
Mohamed Hammami      3       181         30.54