Title
A hybrid approach to speed-up the k-means clustering method.
Abstract
The k-means clustering method is an iterative, partition-based method that, for finite data sets, converges to a solution in finite time. Its running time grows linearly with the size of the data set. Many variants have been proposed to speed up the conventional k-means clustering method. In this paper, we propose a prototype-based hybrid approach to speed up the k-means clustering method. The proposed method first partitions the data set into small clusters (grouplets) of varying sizes. Each grouplet is represented by a prototype. The set of prototypes is then partitioned into k clusters using a modified k-means method. The modified k-means clustering method is similar to the conventional k-means method, but it avoids empty clusters (clusters to which no pattern is assigned) during the iterative process. In each cluster of prototypes, each prototype is replaced by its corresponding set of patterns (those that formed the grouplet) to derive a partition of the data set. Since this partition can deviate from the one obtained by running the conventional k-means method over the entire data set, a correcting step is proposed. Both theoretically and experimentally, the conventional k-means method and the proposed hybrid method (augmented with the correcting step) are shown to yield the same result, provided the initial k seed points are the same. The proposed method, however, is much faster than the conventional one. Experimentally, the proposed method is compared with the conventional method and with other recent methods proposed to speed up the k-means method.
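The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how grouplets are formed, so the sketch assumes a single-pass leaders clustering with a distance `threshold` (suggested only by the keyword list). The rule for avoiding empty clusters (an empty cluster keeps its previous centre) and the form of the correcting step (ordinary k-means iterations over the full data set, started from the centres found on the prototypes) are also assumptions. All function and parameter names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def leaders_grouplets(data: Sequence[Point], threshold: float) -> List[List[int]]:
    """Assumed grouplet step: single-pass leaders clustering.
    A pattern joins the first leader within `threshold`, else becomes a new leader."""
    leaders: List[Point] = []
    groups: List[List[int]] = []
    for i, x in enumerate(data):
        for g, leader in enumerate(leaders):
            if math.dist(x, leader) <= threshold:
                groups[g].append(i)
                break
        else:
            leaders.append(x)
            groups.append([i])
    return groups


def assign(points: Sequence[Point], centres: Sequence[Point]) -> List[int]:
    """Index of the nearest centre for every point."""
    return [min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            for p in points]


def kmeans(points, weights, seeds, max_iter=100):
    """Weighted k-means iterations. Assumed empty-cluster rule:
    a cluster that receives no point keeps its previous centre."""
    centres = [tuple(s) for s in seeds]
    for _ in range(max_iter):
        labels = assign(points, centres)
        new_centres = []
        for j, old in enumerate(centres):
            members = [(p, w) for p, w, l in zip(points, weights, labels) if l == j]
            if not members:
                new_centres.append(old)  # avoid an empty cluster
                continue
            total = sum(w for _, w in members)
            new_centres.append(tuple(
                sum(w * p[d] for p, w in members) / total for d in range(len(old))))
        if new_centres == centres:
            break
        centres = new_centres
    return centres, assign(points, centres)


def hybrid_kmeans(data, seeds, threshold):
    groups = leaders_grouplets(data, threshold)
    # Prototype = mean of the grouplet, weighted by grouplet size.
    prototypes = [tuple(sum(data[i][d] for i in g) / len(g)
                        for d in range(len(data[0]))) for g in groups]
    weights = [float(len(g)) for g in groups]
    # With size weights, each centre equals the centroid of the patterns
    # obtained by replacing every prototype with its grouplet.
    centres, _ = kmeans(prototypes, weights, seeds)
    # Assumed correcting step: refine on the full data set from those centres.
    return kmeans(data, [1.0] * len(data), centres)
```

For example, `hybrid_kmeans(data, seeds, threshold=2.0)` returns a pair `(centres, labels)`. The expensive full-data iterations start from centres that are already close to a solution, which is where any saving would come from in this sketch.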
Year: 2013
DOI: 10.1007/s13042-012-0079-7
Venue: Int. J. Machine Learning & Cybernetics
Keywords: k-means clustering method, Prototypes, Hybrid clustering, Leaders clustering method
Field: k-medians clustering, Fuzzy clustering, k-means clustering, Correlation clustering, Iterative and incremental development, Algorithm, Consensus clustering, Cluster analysis, Mathematics, Single-linkage clustering
DocType: Journal
Volume: 4
Issue: 2
ISSN: 1868-808X
Citations: 14
PageRank: 0.60
References: 27
Authors (3)
Name                 Order  Citations  PageRank
T. Hitendra Sarma    1      26         2.16
P. Viswanath         2      148        11.77
B. Eswara Reddy      3      78         11.25