Abstract | ||
---|---|---|
Learning from imbalanced data is attracting increasing interest from the machine learning community, mainly because of the large number of real-world applications affected by this situation. Adapting standard decision trees to imbalanced data is one of the many approaches that have been developed to address this problem. This adaptation has been proposed from three perspectives: the splitting criterion, the assignment rule, and pruning. In this paper, we focus on the pruning of decision trees. We propose an adaptation of the standard pruning algorithm MCCP to address the skewed-data problem. Our contribution operates at two levels: adapting the metric used to select the nodes to be pruned first, and changing the evaluation measure used to select the best decision tree on the pruning set. Our goal is to show that, contrary to the popular belief in the literature that decision-tree pruning is useless, a pruning technique adapted to imbalanced situations is more efficient and more accurate on the minority class. Twelve binary-class data-sets with different imbalance ratios are used to test the performance of the proposed method. Experimental results show that the proposed post-pruning approach can increase the performance of imbalanced decision trees in terms of recent evaluation measures appropriate for imbalanced classification. © 2017 The Authors. Published by Elsevier B.V. |
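
The second level of the contribution, choosing the best pruned tree on the pruning set with an imbalance-aware measure rather than plain accuracy, can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the measure used, so the geometric mean of sensitivity and specificity (G-mean) stands in here as an assumption, and the names `select_tree`, `candidates`, and `y_prune` are hypothetical. Each candidate is represented only by its predictions on the pruning set.

```python
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of pruning-set examples classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed stand-in measure)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return sqrt(tpr * tnr)


def select_tree(candidates, y_true, score):
    """Return the name of the candidate tree with the highest score."""
    return max(candidates, key=lambda name: score(y_true, candidates[name]))


# Hypothetical pruning set: 8 majority (0) and 2 minority (1) examples.
y_prune = [0] * 8 + [1] * 2

# Hypothetical predictions of two pruned trees on that set.
candidates = {
    "root_only": [0] * 10,                    # fully pruned: always predicts the majority class
    "subtree": [0] * 5 + [1] * 3 + [1] * 2,   # keeps a split: 3 false positives, both positives found
}

print(select_tree(candidates, y_prune, accuracy))  # root_only (0.8 vs 0.7)
print(select_tree(candidates, y_prune, g_mean))    # subtree (about 0.79 vs 0.0)
```

In this toy case accuracy prefers the tree that never predicts the minority class, while the imbalance-aware measure keeps the subtree that recovers it. This is the kind of difference a change of evaluation measure is meant to expose.
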
Year | DOI | Venue |
---|---|---|
2017 | 10.1016/j.procs.2017.08.060 | Procedia Computer Science |
Keywords | Field | DocType |
---|---|---|
imbalanced class, decision-tree pruning, skewed data-sets, IBA, B42 | Pruning algorithm, Decision tree, Data mining, Data set, Computer science, Principal variation search, Artificial intelligence, Pruning (decision trees), Machine learning, Pruning, Binary number | Conference |

Volume | ISSN | Citations |
---|---|---|
112 | 1877-0509 | 3 |

PageRank | References | Authors |
---|---|---|
0.37 | 12 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ikram Chaabane | 1 | 3 | 0.37 |
Radhouane Guermazi | 2 | 23 | 5.55 |
Mohamed Hammami | 3 | 181 | 30.54 |