Abstract |
---|
We propose a new methodology for clustering data comprising multiple domains or parts, in which the separate domains mutually supervise each other within a semi-supervised learning framework. Unlike existing uses of semi-supervised learning, our methodology does not assume that labels are available for part of the data. Instead, each domain separately undergoes an unsupervised learning process while sending supervisory information, in the form of data constraints, to the other domains and receiving it from them. The entire process alternates semi-supervised learning stages across the different data domains, based on Basu et al.'s Hidden Markov Random Field (HMRF) variant of the K-means algorithm for semi-supervised clustering, which combines the constraint-based and distance-based approaches in a unified model. Our experiments demonstrate successful mutual semi-supervision between the domains during clustering, which outperforms the traditional heterogeneous-domain clustering baselines of converting the domains to a single domain or clustering each domain separately. |
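
The alternation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces HMRF-KMeans with a simplified penalty-based constrained K-means (no metric learning), and it assumes constraints are drawn by randomly sampling point pairs from the other domain's current clustering. All function names and parameters (`constrained_kmeans`, `constraints_from_labels`, `mutual_clustering`, `w`, `n_pairs`) are hypothetical and chosen only for illustration.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def constraints_from_labels(labels, n_pairs, rng):
    """Sample random pairs and turn the other domain's labels into
    (i, j, must_link) constraints. Random sampling is an assumption."""
    n = len(labels)
    cons = []
    for _ in range(n_pairs):
        i, j = rng.sample(range(n), 2)
        cons.append((i, j, labels[i] == labels[j]))
    return cons

def constrained_kmeans(X, k, cons, w=1.0, iters=20, rng=None):
    """K-means whose assignment step adds a penalty w for every violated
    must-link or cannot-link constraint (a simplification of HMRF-KMeans)."""
    rng = rng or random.Random(0)
    n = len(X)
    centers = [list(X[i]) for i in rng.sample(range(n), k)]
    labels = [min(range(k), key=lambda c: sq_dist(X[i], centers[c]))
              for i in range(n)]
    # Index constraints by point for fast lookup during assignment.
    by_point = {}
    for i, j, must in cons:
        by_point.setdefault(i, []).append((j, must))
        by_point.setdefault(j, []).append((i, must))
    for _ in range(iters):
        changed = False
        for i in range(n):
            def cost(c):
                # A constraint is violated when "same cluster" disagrees
                # with the must-link flag.
                penalty = sum(w for j, must in by_point.get(i, [])
                              if (labels[j] == c) != must)
                return sq_dist(X[i], centers[c]) + penalty
            best = min(range(k), key=cost)
            if best != labels[i]:
                labels[i] = best
                changed = True
        for c in range(k):
            members = [X[i] for i in range(n) if labels[i] == c]
            if members:  # keep the old center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        if not changed:
            break
    return labels

def mutual_clustering(XA, XB, k, rounds=3, n_pairs=50, seed=0):
    """Alternate constrained clustering on two domains that describe the
    same objects (row i of XA and row i of XB are the same object)."""
    rng = random.Random(seed)
    labels_b = constrained_kmeans(XB, k, [], rng=rng)  # unsupervised start
    labels_a = None
    for _ in range(rounds):
        cons_a = constraints_from_labels(labels_b, n_pairs, rng)
        labels_a = constrained_kmeans(XA, k, cons_a, rng=rng)
        cons_b = constraints_from_labels(labels_a, n_pairs, rng)
        labels_b = constrained_kmeans(XB, k, cons_b, rng=rng)
    return labels_a, labels_b
```

A call such as `mutual_clustering(XA, XB, k=2)` returns one label list per domain. The penalty weight `w` controls how strongly a domain trusts the constraints it receives relative to its own distances. Cluster indices are not aligned across domains, which is why the exchange uses pairwise constraints rather than labels.
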
Year | DOI | Venue |
---|---|---|
2012 | 10.1007/978-3-642-34109-0_4 | SPIRE |
Keywords | Field | DocType |
semi-supervised learning stage, mutual semi-supervision, data constraint, different data domain, heterogeneous data, semi-supervised learning framework, semi-supervised clustering, entire process, semi-supervised learning, different domain, unsupervised learning process, clustering data | Fuzzy clustering, Data mining, CURE data clustering algorithm, Data stream clustering, Correlation clustering, Computer science, Constrained clustering, Conceptual clustering, Cluster analysis, Single-linkage clustering | Conference |
Volume | ISSN | Citations |
7608 | 0302-9743 | 2 |
PageRank | References | Authors |
0.37 | 14 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Artur Abdullin | 1 | 10 | 1.88 |
Olfa Nasraoui | 2 | 1515 | 164.53 |