Title
PIDT: A Novel Decision Tree Algorithm Based on Parameterised Impurities and Statistical Pruning Approaches
Abstract
In the process of constructing a decision tree, the criteria for selecting the splitting attributes influence the performance of the model produced by the decision tree algorithm. The most well-known criteria, such as Shannon entropy and the Gini index, suffer from a lack of adaptability to the datasets. This paper presents novel splitting attribute selection criteria based on families of parameterised impurities that we propose for use in the construction of optimal decision trees. These criteria rely on families of strictly concave functions that define the new generalised parameterised impurity measures, which we applied in devising and implementing our novel PIDT decision tree algorithm. This paper also proposes the S-condition, based on statistical permutation tests, whose purpose is to ensure that the reduction in impurity, or gain, for the selected attribute is statistically significant. We implemented the S-pruning procedure, based on the S-condition, to prevent model overfitting. These methods were evaluated on a number of simulated and benchmark datasets. Experimental results suggest that by tuning the parameters of the impurity measures and by using our S-pruning method, we obtain better decision tree classifiers with the PIDT algorithm.
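The two ideas in the abstract, a parameterised impurity and a permutation test on the gain, can be illustrated with a minimal sketch. The abstract does not specify the paper's impurity families, so Tsallis entropy is used here purely as a stand-in: it is a well-known concave family with one parameter `q` that yields the Gini index at `q = 2` and Shannon entropy as `q` approaches 1. The function names, the default of 999 permutations, and the fixed seed are all assumptions made for illustration, not details of PIDT or of the S-condition as the authors define it.

```python
import math
import random
from collections import Counter, defaultdict


def tsallis_impurity(labels, q=2.0):
    """Tsallis entropy of the class distribution (stand-in impurity family).

    q = 2 gives the Gini index; q -> 1 gives Shannon entropy (in nats).
    """
    n = len(labels)
    if n == 0:
        return 0.0
    probs = [count / n for count in Counter(labels).values()]
    if abs(q - 1.0) < 1e-12:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def impurity_gain(values, labels, q=2.0):
    """Reduction in impurity obtained by splitting on a categorical attribute."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    weighted = sum(len(g) / n * tsallis_impurity(g, q) for g in groups.values())
    return tsallis_impurity(labels, q) - weighted


def permutation_p_value(values, labels, q=2.0, n_perm=999, seed=0):
    """Estimate how often a gain this large arises when labels are shuffled."""
    rng = random.Random(seed)
    observed = impurity_gain(values, labels, q)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if impurity_gain(values, shuffled, q) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: one informative and one uninformative attribute.
labels = [0] * 10 + [1] * 10
informative = ["a"] * 10 + ["b"] * 10
noise = ["a", "b"] * 10
p_informative = permutation_p_value(informative, labels)
p_noise = permutation_p_value(noise, labels)
```

Under this sketch, a split would be accepted only if its p-value falls below a chosen significance level, and a node whose best split fails the check would become a leaf. How the paper sets that level and integrates the check into tree growth is not stated in the abstract.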
Year: 2018
DOI: 10.1007/978-3-319-92007-8_24
Venue: ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2018
Keywords: Machine learning, Decision trees, Parameterised impurity measures, Concave functions, Optimisation, Preventing overfitting, Statistical pruning, Permutation test, Significance level
Field: Decision tree, Optimal decision, Feature selection, Computer science, Permutation, Artificial intelligence, Overfitting, Resampling, Entropy (information theory), Machine learning, Decision tree learning
DocType: Conference
Volume: 519
ISSN: 1868-4238
Citations: 0
PageRank: 0.34
References: 6
Authors: 5
Name, Order, Citations and PageRank
Daniel Stamate, 1, 6636.68
Daniel Stamate, 2, 6636.68
Wajdi Alghamdi, 3, 00.34
Doina Logofatu, 4, 1716.74
Alexander Zamyatin, 5, 62.41