Title
Probabilistic cluster structure ensemble
Abstract
Cluster structure ensemble focuses on integrating multiple cluster structures extracted from different datasets into a unified cluster structure, instead of aligning the individual labels from the clustering solutions derived from multiple homogeneous datasets, as is done in the cluster ensemble framework. In this article, we design a novel probabilistic cluster structure ensemble framework, referred to as the Gaussian mixture model based cluster structure ensemble framework (GMMSE), to identify the most representative cluster structure from the dataset. Specifically, GMMSE first applies the bagging approach to produce a set of variant datasets. Then, a set of Gaussian mixture models is used to capture the underlying cluster structures of the datasets. GMMSE applies K-means to initialize the parameters of each Gaussian mixture model, and adopts the Expectation Maximization (EM) approach to estimate the parameter values of the model. Next, the components of the Gaussian mixture models are viewed as new data samples, which are used to construct the representative matrix capturing the relationships among components. The similarity between two components, each corresponding to a Gaussian distribution, is measured by the Bhattacharyya distance function. Afterwards, GMMSE constructs a graph based on the new data samples and the representative matrix, and searches for the most representative cluster structure. Finally, we also design four criteria to assign the data samples to their corresponding clusters based on the unified cluster structure. The experimental results show that (i) GMMSE works well on synthetic datasets and real datasets in the UCI machine learning repository, and (ii) GMMSE outperforms most of the previous cluster ensemble approaches.
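The abstract does not spell out how the distance between two Gaussian components is computed. The sketch below uses the standard closed form of the Bhattacharyya distance between two multivariate Gaussians, D_B = (1/8) (m1 - m2)^T S^{-1} (m1 - m2) + (1/2) ln(det S / sqrt(det S1 * det S2)) with S = (S1 + S2) / 2. The function names, the NumPy dependency, and the pairwise-matrix helper are assumptions made for illustration; this is not the authors' implementation and does not reproduce their representative matrix.

```python
import numpy as np


def bhattacharyya_distance(mean1, cov1, mean2, cov2):
    """Bhattacharyya distance between N(mean1, cov1) and N(mean2, cov2).

    Illustrative helper (hypothetical name); assumes both covariance
    matrices are symmetric positive definite.
    """
    mean1 = np.atleast_1d(np.asarray(mean1, dtype=float))
    mean2 = np.atleast_1d(np.asarray(mean2, dtype=float))
    cov1 = np.atleast_2d(np.asarray(cov1, dtype=float))
    cov2 = np.atleast_2d(np.asarray(cov2, dtype=float))

    cov_avg = 0.5 * (cov1 + cov2)
    diff = mean1 - mean2

    # Mean term: (1/8) * diff^T * cov_avg^{-1} * diff, without an explicit inverse.
    mean_term = 0.125 * diff @ np.linalg.solve(cov_avg, diff)

    # Covariance term, computed with log-determinants for numerical stability.
    _, logdet_avg = np.linalg.slogdet(cov_avg)
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    cov_term = 0.5 * (logdet_avg - 0.5 * (logdet1 + logdet2))

    return float(mean_term + cov_term)


def pairwise_component_distances(means, covs):
    """Symmetric matrix of distances between all pairs of components.

    Hypothetical helper: `means` and `covs` are sequences holding the
    mean vector and covariance matrix of each Gaussian component.
    """
    n = len(means)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = bhattacharyya_distance(means[i], covs[i], means[j], covs[j])
            dist[i, j] = dist[j, i] = d
    return dist
```

In a pipeline like the one described, `means` and `covs` could be the pooled component parameters of the mixture models fitted to the bagged datasets. How the resulting distances are turned into the representative matrix is left open here, since the abstract does not specify it.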
Year
2014
DOI
10.1016/j.ins.2014.01.030
Venue
Inf. Sci.
Keywords
multiple cluster structure, gaussian mixture model, cluster ensemble framework, novel probabilistic cluster structure, cluster structure ensemble, probabilistic cluster structure ensemble, representative matrix, unified cluster structure, representative cluster structure, corresponding cluster, cluster structure ensemble framework
Field
Cluster (physics), Pattern recognition, Matrix (mathematics), Expectation–maximization algorithm, Metric (mathematics), Gaussian, Artificial intelligence, Probabilistic logic, Cluster analysis, Mixture model, Machine learning, Mathematics
DocType
Journal
Volume
267
ISSN
0020-0255
Citations
20
PageRank
0.60
References
44
Authors
7
Name | Order | Citations | PageRank
Zhiwen Yu | 1 | 2753 | 220.67
Le Li | 2 | 158 | 10.10
Hau-San Wong | 3 | 1008 | 86.89
Jane You | 4 | 1885 | 136.93
Guoqiang Han | 5 | 439 | 43.27
Yunjun Gao | 6 | 862 | 89.71
Guoxian Yu | 7 | 234 | 21.81