Efficient Algorithms for Constrained Clustering with Side Information. - Citegraph

Paper Info

Title
Efficient Algorithms for Constrained Clustering with Side Information.

Abstract
Clustering, an unsupervised machine learning method, has broad applications in data science and natural language processing. In this paper, we use background knowledge, or side information, about the data as constraints to improve clustering accuracy. Following the representation in [15], we first format the side information as must-link sets and cannot-link sets. We then propose a constrained k-means algorithm for clustering the data. The key idea for must-link sets is to treat each set as a single data point with large volume, that is, to assign the whole set to the center closest to its mass center. In contrast, the key idea for cannot-link sets is to transform the assignment of the involved data points into the computation of a minimum weight perfect matching. Finally, we carry out numerical simulations to evaluate our constrained k-means algorithms on UCI datasets. The experimental results demonstrate that our method outperforms previous constrained k-means algorithms as well as classical k-means in both clustering accuracy and runtime.
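The two assignment rules described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`assign_must_link`, `assign_cannot_link`), the squared Euclidean distance, and the toy data are all assumptions made for illustration, and the minimum weight matching is found here by brute force over permutations rather than by a dedicated matching algorithm, which is only practical for a small number of centers.

```python
from itertools import permutations


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Mass center (coordinate-wise mean) of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def assign_must_link(points, centers):
    """Assign a whole must-link set to the center nearest its mass center.

    Hypothetical helper: returns a single center index for the entire set.
    """
    m = centroid(points)
    return min(range(len(centers)), key=lambda j: sq_dist(m, centers[j]))


def assign_cannot_link(points, centers):
    """Assign the points of one cannot-link set to pairwise distinct centers.

    Hypothetical helper: picks the assignment of minimum total squared
    distance by brute force, standing in for a minimum weight matching solver.
    """
    if len(points) > len(centers):
        raise ValueError("a cannot-link set cannot exceed the number of centers")

    def cost(perm):
        return sum(sq_dist(p, centers[j]) for p, j in zip(points, perm))

    return list(min(permutations(range(len(centers)), len(points)), key=cost))


# Toy data (assumed): two centers on a line.
centers = [(0.0, 0.0), (10.0, 0.0)]

# The mass center of this must-link set is (4, 0), so the whole set goes to
# center 0, even though (9, 0) alone would be closer to center 1.
ml_label = assign_must_link([(1.0, 0.0), (2.0, 0.0), (9.0, 0.0)], centers)

# Both points are nearest to center 0, but the cannot-link constraint forces
# distinct centers; the cheapest such assignment is [0, 1].
cl_labels = assign_cannot_link([(1.0, 0.0), (2.0, 0.0)], centers)
```

In a full constrained k-means loop, these two steps would replace the usual nearest-center assignment for constrained points, followed by the standard center update. For larger instances, the brute-force search would presumably be replaced by a polynomial-time matching routine.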

Year	DOI	Venue
2019	10.1007/978-981-15-2767-8_25	PAAP
Field	DocType	Citations
Data point,Data set,Computer science,Algorithm,Matching (graph theory),Unsupervised learning,Minimum weight,Constrained clustering,Cluster analysis,Computation	Conference	0
PageRank	References	Authors
0.34	0	5

Authors (5 rows)

Name	Order	Citations	PageRank
Zhendong Hao	1	0	0.34
Longkun Guo	2	6	5.49
Pei Yao	3	0	1.01
Peihuang Huang	4	0	0.68
Huihong Peng	5	0	0.34
