Abstract | ||
---|---|---|
Spherical k-means is a widely used clustering algorithm for sparse, high-dimensional data such as document vectors. While several improvements and accelerations have been introduced for the original k-means algorithm, not all of them translate easily to the spherical variant: many acceleration techniques, such as the algorithms of Elkan and Hamerly, rely on the triangle inequality of Euclidean distances, whereas spherical k-means uses cosine similarities instead of distances for computational efficiency. In this paper, we incorporate the Elkan and Hamerly accelerations into the spherical k-means algorithm, working directly with cosines instead of Euclidean distances, to obtain a substantial speedup, and we evaluate these spherical accelerations on real data. |
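
To make the abstract's setting concrete, the following is a minimal sketch of plain (unaccelerated) spherical k-means, in which every point is assigned to the center with the highest cosine similarity. It is not the paper's accelerated algorithm. The function names (`normalize_rows`, `spherical_kmeans`, `cosine_lower_bound`), the random initialization, and the fixed iteration count are assumptions made for illustration. The helper `cosine_lower_bound` shows one standard way a triangle inequality can be stated directly on cosines (it follows from the triangle inequality on angles); the abstract does not say which bound the paper actually uses.

```python
import numpy as np

def normalize_rows(X):
    """Scale each row to unit Euclidean length (rows near zero are left near zero)."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, 1e-12)

def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd-style spherical k-means (illustrative sketch, no acceleration).

    Returns the labels from the last assignment step and the unit-length centers.
    """
    rng = np.random.default_rng(seed)
    X = normalize_rows(np.asarray(X, dtype=float))
    # Assumption: initialize centers with k distinct data points chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # For unit vectors, the dot product equals the cosine similarity.
        sims = X @ centers.T
        labels = sims.argmax(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                # New center direction: sum of the members (normalized below).
                centers[j] = members.sum(axis=0)
        centers = normalize_rows(centers)
    return labels, centers

def cosine_lower_bound(sim_xz, sim_zy):
    """Lower bound on cos(x, y) given cos(x, z) and cos(z, y).

    Derived from angle(x, y) <= angle(x, z) + angle(z, y) and the
    cosine addition formula; shown here as an assumption-labeled example.
    """
    a = np.clip(sim_xz, -1.0, 1.0)
    b = np.clip(sim_zy, -1.0, 1.0)
    return a * b - np.sqrt((1.0 - a * a) * (1.0 - b * b))
```

A bound of this kind is what would let an Elkan- or Hamerly-style method skip similarity computations: if the bound already rules out a center as the best match, the exact cosine need not be evaluated.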
Year | DOI | Venue |
---|---|---|
2021 | 10.1007/978-3-030-89657-7_17 | SIMILARITY SEARCH AND APPLICATIONS, SISAP 2021 |
DocType | Volume | ISSN |
Conference | 13058 | 0302-9743 |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Erich Schubert | 1 | 0 | 1.35 |
Andreas Lang | 2 | 13 | 5.02 |
Gloria Feher | 3 | 0 | 0.34 |