Title
Scalable Exemplar Clustering and Facility Location via Augmented Block Coordinate Descent with Column Generation.
Abstract
In recent years exemplar clustering has become a popular tool for applications in document and video summarization, active learning, and clustering with general similarity, where cluster centroids are required to be a subset of the data samples rather than their linear combinations. The problem is also well-known as facility location in the operations research literature. While the problem has well-developed convex relaxation with approximation and recovery guarantees, its number of variables grows quadratically with the number of samples. Therefore, state-of-the-art methods can hardly handle more than 10(4) samples (i.e. 10(8) variables). In this work, we propose an Augmented-Lagrangian with Block Coordinate Descent (AL-BCD) algorithm that utilizes problem structure to obtain closed-form solution for each block sub-problem, and exploits low-rank representation of the dissimilarity matrix to search active columns without computing the entire matrix. Experiments show our approach to be orders of magnitude faster than existing approaches and can handle problems of up to 10(6) samples. We also demonstrate successful applications of the algorithm on world-scale facility location, document summarization and active learning.
Year
Venue
Field
2016
JMLR Workshop and Conference Proceedings
Linear combination,Automatic summarization,Column generation,Mathematical optimization,Computer science,Facility location problem,Artificial intelligence,Coordinate descent,Cluster analysis,Machine learning,Centroid,Scalability
DocType
Volume
ISSN
Conference
51
1938-7288
Citations 
PageRank 
References 
3
0.40
23
Authors
3
Name
Order
Citations
PageRank
Ian En-Hsu Yen1849.56
Dmitry Malioutov271.88
abhishek kumar394335.34