Title
A hierarchical fuzzy cluster ensemble approach and its application to big data clustering
Abstract
Cluster ensembles integrate individual component methods that may use different parameter settings and features, and that may themselves be generated from different representations and learning mechanisms. Such a technique offers an effective means of aggregating multiple clustering results to improve overall clustering accuracy and robustness. Many approaches to cluster ensembles have been proposed, with promising results reported in the literature. To reinforce this development, this paper presents another cluster ensemble approach for fuzzy clustering, aimed at clustering big data. The proposed algorithm first generates fuzzy base clusters with respect to each data feature and then employs a fuzzy hierarchical graph to represent the relationships between the resulting base clusters. Although the work employs fuzzy c-means to generate the base clusters and hierarchical clustering to implement the consensus function, when applied to large datasets it has lower time complexity than the original fuzzy c-means and hierarchical clustering. The resulting ensemble clustering mechanism is tested against traditional clustering methods on various benchmark datasets. Experimental results demonstrate that it generally outperforms crisp cluster ensembles and single-linkage agglomerative clustering in terms of both accuracy and time efficiency, thereby showing its potential for application to clustering big data.
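The two stages named in the abstract (fuzzy base clusters generated per feature, then a hierarchical consensus over those base clusters) can be illustrated with a rough sketch. This is not the paper's algorithm: the abstract does not give the details of the fuzzy hierarchical graph, so the sketch below assumes a fuzzy Jaccard similarity between base clusters, average-linkage merging, and a mean-membership rule for the final labels. The function names (`fcm_1d`, `fuzzy_jaccard`, `fuzzy_feature_ensemble`) and the parameters `c`, `k`, `m` and `iters` are hypothetical.

```python
def fcm_1d(values, c, m=2.0, iters=50):
    """Fuzzy c-means on one feature; returns c membership vectors."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic initial centres spread evenly over the range.
    centres = [lo + (hi - lo) * (j + 0.5) / c for j in range(c)]
    u = [[0.0] * len(values) for _ in range(c)]
    for _ in range(iters):
        # Membership update: u_ji is proportional to d_ji^(-2/(m-1)).
        for i, x in enumerate(values):
            d = [abs(x - v) for v in centres]
            if min(d) < 1e-12:
                # Point sits on a centre: give it full membership there.
                hit = d.index(min(d))
                for j in range(c):
                    u[j][i] = 1.0 if j == hit else 0.0
            else:
                inv = [dj ** (-2.0 / (m - 1.0)) for dj in d]
                total = sum(inv)
                for j in range(c):
                    u[j][i] = inv[j] / total
        # Centre update: mean of the values weighted by u^m.
        for j in range(c):
            w = [uji ** m for uji in u[j]]
            if sum(w) > 0:
                centres[j] = sum(wi * x for wi, x in zip(w, values)) / sum(w)
    return u


def fuzzy_jaccard(a, b):
    """Assumed similarity between two fuzzy clusters: sum(min) / sum(max)."""
    den = sum(max(x, y) for x, y in zip(a, b))
    return sum(min(x, y) for x, y in zip(a, b)) / den if den > 0 else 0.0


def fuzzy_feature_ensemble(data, c, k):
    """Hypothetical consensus: merge per-feature fuzzy clusters into k groups."""
    # Stage 1: fuzzy base clusters, generated separately for each feature.
    base = [u for feature in zip(*data) for u in fcm_1d(list(feature), c)]
    sim = [[fuzzy_jaccard(a, b) for b in base] for a in base]
    # Stage 2: average-linkage agglomeration of the base clusters.
    groups = [[i] for i in range(len(base))]
    while len(groups) > k:
        best = None
        for p in range(len(groups)):
            for q in range(p + 1, len(groups)):
                pairs = [(i, j) for i in groups[p] for j in groups[q]]
                s = sum(sim[i][j] for i, j in pairs) / len(pairs)
                if best is None or s > best[0]:
                    best = (s, p, q)
        _, p, q = best
        merged = groups.pop(q)  # q > p, so index p is unaffected
        groups[p].extend(merged)
    # Final label: the group with the highest mean membership for each point.
    labels = []
    for i in range(len(data)):
        scores = [sum(base[b][i] for b in g) / len(g) for g in groups]
        labels.append(scores.index(max(scores)))
    return labels


# Usage on a toy dataset with two well-separated groups.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (10.0, 10.1), (10.2, 9.9), (9.9, 10.0)]
print(fuzzy_feature_ensemble(points, c=2, k=2))
```

Because each fuzzy c-means run sees only one feature, the base clustering step works on one-dimensional data and the consensus step operates on the base clusters rather than on the individual data points. This is one plausible reading of why such a design could scale better than running either method on the full dataset, but the complexity claim itself is the paper's, not something this sketch demonstrates.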
Year
2015
DOI
10.3233/IFS-141518
Venue
JOURNAL OF INTELLIGENT & FUZZY SYSTEMS
Keywords
Fuzzy cluster ensemble, big data clustering, fuzzy c-means, hierarchical clustering, data mining
Field
Hierarchical clustering, Canopy clustering algorithm, Fuzzy clustering, Data mining, CURE data clustering algorithm, Computer science, Consensus clustering, Artificial intelligence, Cluster analysis, Brown clustering, Machine learning, Single-linkage clustering
DocType
Journal
Volume
28
Issue
6
ISSN
1064-1246
Citations
13
PageRank
0.59
References
26
Authors (3)
Name, Order, Citations, PageRank
Pan Su, 1, 82, 11.72
Changjing Shang, 2, 212, 34.92
Qiang Shen, 3, 864, 55.09