Title
A Support Based Initialization Algorithm for Categorical Data Clustering
Abstract
Several initial center selection algorithms have been proposed in the literature for numerical data, but because the values of categorical data are unordered, these methods are not applicable to categorical data sets. This article investigates the initial center selection process for categorical data and then presents a new support-based initial center selection algorithm. The proposed algorithm measures the weight of each unique value of an attribute using its support and then integrates these weights along the rows to obtain the support of every row. The data object with the largest support is chosen as the first initial center, and the other centers are then chosen as the objects at the greatest distance from the initially selected center. The quality of the proposed algorithm is compared with that of the random initial center selection method, Cao's method, Wu's method, and the method introduced by Khan and Ahmad. Experimental analysis on real data sets shows the effectiveness of the proposed algorithm.
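The abstract gives only an outline of the procedure, so the following is a minimal sketch of one possible reading rather than the authors' implementation. It assumes that the support of a value is its relative frequency within its attribute, that row support is the plain sum of those value supports, that distance is simple matching (Hamming) distance, and that ties are broken by higher row support. The function name `support_based_centers` is hypothetical.

```python
from collections import Counter


def support_based_centers(rows, k):
    """Pick up to k initial centers from equal-length categorical tuples.

    Assumptions (not confirmed by the abstract): support = relative
    frequency of a value within its attribute; row support = sum of the
    value supports; distance = simple matching (Hamming) distance.
    """
    n = len(rows)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of rows")

    # Count how often each value occurs in each attribute (column).
    value_counts = [Counter(col) for col in zip(*rows)]

    # Row support: sum over attributes of the support of the row's value.
    row_support = [
        sum(value_counts[j][v] / n for j, v in enumerate(row))
        for row in rows
    ]

    # First center: the row with the largest support (earliest row on ties).
    first = max(range(n), key=lambda i: row_support[i])
    centers = [rows[first]]

    def hamming(a, b):
        # Number of attributes on which the two rows disagree.
        return sum(x != y for x, y in zip(a, b))

    # Rank the remaining rows by distance from the first center
    # (farthest first), breaking ties by higher row support.
    ranked = sorted(
        (i for i in range(n) if i != first),
        key=lambda i: (-hamming(rows[i], rows[first]), -row_support[i]),
    )

    # Take the farthest rows, skipping exact duplicates of chosen centers.
    for i in ranked:
        if len(centers) == k:
            break
        if rows[i] not in centers:
            centers.append(rows[i])
    return centers
```

If the data contain fewer than k distinct rows, this sketch returns fewer than k centers. The resulting centers could then be passed as the initial modes to a k-modes implementation.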
Year: 2018
DOI: 10.4018/JITR.2018040104
Venue: Periodicals
Keywords: Accuracy, Clustering, K-modes, Precision, Recall, Support
Field: Data mining, Computer science, Categorical variable, Initialization, Cluster analysis
DocType: Journal
Volume: 11
Issue: 2
ISSN: 1938-7857
Citations: 0
PageRank: 0.34
References: 9
Authors: 2
Name, Order, Citations, PageRank
Ajay Kumar, 1, 2, 0.70
Shishir Kumar, 2, 78, 17.06