Title
Minimization of Gini impurity via connections with the k-means problem.
Abstract
The Gini impurity is one of the measures used to select attributes in the construction of decision trees and random forests. In this note we discuss connections between the problem of computing the partition with minimum weighted Gini impurity and the $k$-means clustering problem. Based on these connections, we show that computing the partition with minimum weighted Gini impurity is an NP-complete problem, and we also discuss how to obtain new algorithms with provable approximation guarantees for the Gini minimization problem.
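To make the quantity in the abstract concrete, the following is a minimal Python sketch of the weighted Gini impurity of a partition, together with one elementary identity that links it to a k-means style cost. It is an illustration under stated assumptions, not necessarily the construction used in the note: the weighting is assumed to be the unnormalized sum of |part| times the Gini impurity of each part, and the function names `gini`, `weighted_gini` and `kmeans_cost_one_hot` are hypothetical. When each label is encoded as a one-hot vector, the sum of squared distances from the points of a part to the part's centroid equals |part| times the Gini impurity of that part.

```python
from collections import Counter


def gini(labels):
    """Gini impurity 1 - sum_c p_c^2 of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(partition):
    """Assumed weighting: sum over parts of |part| * gini(part)."""
    return sum(len(part) * gini(part) for part in partition)


def kmeans_cost_one_hot(partition, classes):
    """Sum of squared distances from one-hot labels to their part's centroid."""
    cost = 0.0
    for part in partition:
        if not part:
            continue
        n = len(part)
        counts = Counter(part)
        # The centroid of one-hot vectors is the vector of class proportions.
        centroid = [counts[c] / n for c in classes]
        for label in part:
            one_hot = [1.0 if c == label else 0.0 for c in classes]
            cost += sum((x - m) ** 2 for x, m in zip(one_hot, centroid))
    return cost


# Toy data (made up for illustration): two parts over classes a, b, c.
partition = [["a", "a", "b"], ["b", "b", "b", "c"]]
classes = ["a", "b", "c"]

print(weighted_gini(partition))
print(kmeans_cost_one_hot(partition, classes))
```

On this toy partition both functions return the same value up to floating-point rounding, which is the identity described above. Finding the partition that minimizes this quantity is the hard part that the abstract refers to; the sketch only evaluates a given partition.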
Year
2018
Venue
arXiv: Data Structures and Algorithms
Field
Minimization problem, Discrete mathematics, Decision tree, k-means clustering, Minification, Partition (number theory), Cluster analysis, Random forest, Mathematics, Computation
DocType
Journal
Volume
abs/1810.00029
Citations
0
PageRank
0.34
References
0
Authors
2
Name                Order  Citations  PageRank
Eduardo Sany Laber  1      229        27.12
Lucas Murtinho      2      0          1.01