Title
Tight lower bound instances for k-means++ in two dimensions
Abstract
The k-means++ seeding algorithm is one of the most popular algorithms for choosing the initial k centers when running Lloyd's algorithm for the k-means problem. Brunsch and Röglin [9] conjectured that k-means++ behaves well on datasets of small dimension. More specifically, they conjectured that the k-means++ seeding algorithm gives an O(log d) approximation with high probability for any d-dimensional dataset. In this work, we refute this conjecture by giving two-dimensional datasets on which the k-means++ seeding algorithm achieves an O(log k) approximation ratio only with probability exponentially small in k. This solves open problems posed by Mahajan et al. [12] and by Brunsch and Röglin [9].
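For readers unfamiliar with the seeding step the abstract refers to, the following is a minimal sketch of the standard k-means++ (D² sampling) procedure: the first center is drawn uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. It is not taken from the paper and does not reproduce its lower-bound instances; the function names `sq_dist`, `kmeanspp_seed`, and `kmeans_cost` are hypothetical, chosen only for illustration.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeanspp_seed(points, k, rng=None):
    """Pick k initial centers from `points` by D^2 sampling (illustrative sketch)."""
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # First center: uniform over the dataset.
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest chosen center.
    d2 = [sq_dist(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            nxt = rng.choice(points)
        else:
            # Sample proportionally to the squared distance to the nearest center.
            nxt = rng.choices(points, weights=d2, k=1)[0]
        centers.append(nxt)
        d2 = [min(d, sq_dist(p, nxt)) for d, p in zip(d2, points)]
    return centers


def kmeans_cost(points, centers):
    """k-means objective: sum of squared distances to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)
```

The approximation ratio discussed in the abstract compares `kmeans_cost` for the sampled centers against the cost of an optimal set of k centers. The sketch recomputes distances in O(nk) total time and makes no attempt at efficiency.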
Year: 2016
DOI: 10.1016/j.tcs.2016.04.012
Venue: Theoretical Computer Science
Keywords: k-means++, Lower bounds
Field: Discrete mathematics, k-means clustering, Combinatorics, Upper and lower bounds, Conjecture, Seeding, Mathematics
DocType: Journal
Volume: 634
Issue: C
ISSN: 0304-3975
Citations: 0
PageRank: 0.34
References: 10
Authors: 3
Name, Order, Citations, PageRank
Anup Bhattacharya, 1, 10, 4.27
Ragesh Jaiswal, 2, 220, 18.33
Nir Ailon, 3, 1114, 70.74