Abstract |
---|
Evidence accumulation clustering (EAC) is a clustering combination method in which a pairwise similarity matrix (the so-called co-association matrix) is learnt from a clustering ensemble. This co-association matrix counts the co-occurrences (in the same cluster) of pairs of objects, thus avoiding the cluster correspondence problem faced by many other clustering combination approaches. Starting from the observation that co-occurrences are a special type of dyad, we propose to model co-association using a generative aspect model for dyadic data. Under the proposed model, the extraction of a consensus clustering corresponds to solving a maximum likelihood estimation problem, which we address using the expectation-maximization algorithm. We refer to the resulting method as the probabilistic ensemble clustering algorithm (PEnCA). Moreover, placing the problem in a probabilistic framework allows model selection criteria to be used to choose the number of clusters automatically. To compare our method with other combination techniques (also based on probabilistic modeling of the clustering ensemble problem), we performed experiments with synthetic and real benchmark datasets, showing that the proposed approach leads to competitive results. |
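
As a minimal illustration of the co-association matrix described in the abstract, the following Python sketch counts, for every pair of objects, how many partitions in an ensemble place them in the same cluster. The function name `co_association` and the toy ensemble are illustrative assumptions, not taken from the paper. The sketch covers only the evidence-accumulation step, not the aspect model or the EM fitting that PEnCA uses.

```python
def co_association(ensemble):
    """Count same-cluster co-occurrences over an ensemble of partitions.

    `ensemble` is a list of label lists, one per partition, all of the
    same length (one label per object). Hypothetical helper for illustration.
    """
    n_objects = len(ensemble[0])
    counts = [[0] * n_objects for _ in range(n_objects)]
    for labels in ensemble:
        for i in range(n_objects):
            for j in range(n_objects):
                # Only equality of labels matters, so cluster names never
                # need to be matched across partitions.
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return counts


# Toy ensemble of three partitions over four objects (assumed data).
ensemble = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same grouping as the first, with labels swapped
    [0, 0, 0, 1],
]
C = co_association(ensemble)
for row in C:
    print(row)
```

Because the count depends only on whether two labels agree within a partition, relabeling the clusters of any partition leaves the matrix unchanged, which is why the cluster correspondence problem does not arise.
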

Year | DOI | Venue |
---|---|---|
2011 | 10.1007/978-3-642-24471-1_8 | SIMBAD |

Keywords | Field | DocType |
---|---|---|
maximum likelihood estimation problem, cluster correspondence problem, generative aspect model, clustering combination method, clustering ensemble problem, generative dyadic aspect model, model selection criterion, evidence accumulation clustering, clustering ensemble, clustering combination approach, clustering, unsupervised learning, model selection | Fuzzy clustering, Data mining, CURE data clustering algorithm, Pattern recognition, Correlation clustering, Determining the number of clusters in a data set, Consensus clustering, Constrained clustering, Artificial intelligence, Cluster analysis, Mathematics, Single-linkage clustering | Conference |

Volume | ISSN | Citations |
---|---|---|
7005 | 0302-9743 | 2 |

PageRank | References | Authors |
---|---|---|
0.37 | 12 | 3 |

Name | Order | Citations | PageRank |
---|---|---|---|
André Lourenço | 1 | 312 | 45.33 |
Ana Fred | 2 | 216 | 17.07 |
Mário A. T. Figueiredo | 3 | 7203 | 561.50 |