Title
From cluster ensemble to structure ensemble
Abstract
This paper investigates the problem of integrating multiple structures, each extracted from a different set of data points, into a single unified structure. We first propose a new generalized concept, called structure ensemble, for the fusion of multiple structures. Unlike traditional cluster ensemble approaches, whose main objective is to align the individual labels obtained from different clustering solutions, the structure ensemble approach focuses on how to unify the structures obtained from different data sources. Based on this framework, we design a new structure ensemble approach, the probabilistic bagging based structure ensemble approach (BSEA), which integrates the bagging technique, the force based self-organizing map (FBSOM) and the normalized cut algorithm. BSEA views the structures obtained from the different datasets generated by the bagging technique as nodes in a graph, and adopts graph theory to find the most representative structure. In addition, FBSOM, a generalized form of the self-organizing map (SOM), is proposed to serve as the basic clustering algorithm in the structure ensemble framework. Finally, a new external index called the correlation index (CI), which considers the correlation of both the similarity and the dissimilarity between the predicted solution and the true solution, is proposed to evaluate the performance of BSEA. The experiments show that (i) BSEA outperforms most of the state-of-the-art clustering approaches, and (ii) BSEA performs well on datasets from the UCI repository and on real cancer gene expression profiles.
Year
2012
DOI
10.1016/j.ins.2012.02.019
Venue
Inf. Sci.
Keywords
traditional cluster ensemble, representative structure, new structure ensemble approach, bagging technique, structure ensemble, self-organizing map, structure ensemble framework, structure ensemble approach, single unified structure, multiple structure
Field
Data point, Graph theory, Graph, Data mining, Normalization (statistics), Correlation, Artificial intelligence, Probabilistic logic, Cluster analysis, Ensemble learning, Machine learning, Mathematics
DocType
Journal
Volume
198
ISSN
0020-0255
Citations
19
PageRank
0.56
References
681
Authors
4
Name, Order, Citations, PageRank
Zhiwen Yu, 1, 231, 18.51
Jane You, 2, 1885, 136.93
Hau-San Wong, 3, 1008, 86.89
Guoqiang Han, 4, 439, 43.27