Abstract | ||
---|---|---|
The purpose of this paper is to develop new efficient approaches based on DC (Difference of Convex functions) programming and DCA (DC Algorithm) to perform clustering via minimum sum-of-squares Euclidean distance. We consider the two most widely used models for the so-called Minimum Sum-of-Squares Clustering (MSSC in short) that are a bilevel programming problem and a mixed integer program. Firstly, the mixed integer formulation of MSSC is carefully studied and is reformulated as a continuous optimization problem via a new result on exact penalty technique in DC programming. DCA is then investigated to the resulting problem. Secondly, we introduce a Gaussian kernel version of the bilevel programming formulation of MSSC, named GKMSSC. The GKMSSC problem is formulated as a DC program for which a simple and efficient DCA scheme is developed. A regularization technique is investigated for exploiting the nice effect of DC decomposition and a simple procedure for finding good starting points of DCA is developed. The proposed DCA schemes are original and very inexpensive because they amount to computing, at each iteration, the projection of points onto a simplex and/or onto a ball, and/or onto a box, which are all determined in the explicit form. Numerical results on real word datasets show the efficiency, the scalability of DCA and its great superiority with respect to k-means and kernel k-means, standard methods for clustering. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1016/j.patcog.2013.07.012 | Pattern Recognition |
Keywords | Field | DocType |
dc decomposition,continuous optimization problem,dc program,bilevel programming problem,bilevel programming formulation,gkmssc problem,minimum sum-of-squares,dc programming,dc algorithm,efficient dca scheme,proposed dca scheme,combinatorial optimization,clustering,gaussian kernel | Artificial intelligence,Cluster analysis,Gaussian function,Kernel (linear algebra),Mathematical optimization,Bilevel optimization,Euclidean distance,Algorithm,Combinatorial optimization,Convex function,Explained sum of squares,Machine learning,Mathematics | Journal |
Volume | Issue | ISSN |
47 | 1 | 0031-3203 |
Citations | PageRank | References |
21 | 0.71 | 29 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Le Thi Hoai An | 1 | 1038 | 80.20 |
Le Hoai Minh | 2 | 131 | 7.91 |
Pham Dinh Tao | 3 | 1340 | 104.84 |