Reuse-centric k-means configuration - Citegraph

Paper Info

Title
Reuse-centric k-means configuration

Abstract
K-means configuration is the task of finding a configuration of k-means (e.g., the number of clusters or the feature set) that maximizes some objective. It is time-consuming because k-means is iterative. This paper proposes reuse-centric k-means configuration to accelerate the process. It is based on the observation that the explorations of different configurations share many common or similar computations, so effectively reusing computations from prior trials of different configurations could substantially shorten the configuration time. To realize the idea, the paper presents a set of novel techniques, including reuse-based filtering, center reuse, and a two-phase design, which capitalize on reuse opportunities at three levels: validation, number of clusters, and feature sets. Experiments on k-means–based data classification tasks show that reuse-centric k-means configuration can speed up a heuristic search-based configuration process by a factor of 5.8, and a uniform search-based attainment of classification error surfaces by a factor of 9.1. The paper also provides insights on how to apply the acceleration techniques effectively to realize their full potential.

Year	2021
DOI	10.1016/j.is.2021.101787
Venue	Information Systems
Keywords	K-means, Algorithm configuration, Computation reuse
DocType	Journal
Volume	100
ISSN	0306-4379
Citations	0
PageRank	0.34
References	0
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Lijun Zhang	1	0	0.34
Hui Guan	2	0	0.34
Yufei Ding	3	143	23.07
Xipeng Shen	4	2025	118.55
H. Krim	5	594	126.35
