Title
Issues in clustering algorithm consistency in fixed dimensional spaces. Some solutions for k-means
Abstract
Kleinberg introduced an axiomatic system for clustering functions. Out of three axioms, he proposed, two (scale invariance and consistency) are concerned with data transformations that should produce the same clustering under the same clustering function. The so-called consistency axiom provides the broadest range of transformations of the data set. Kleinberg claims that one of the most popular clustering algorithms, k-means does not have the property of consistency. We challenge this claim by pointing at invalid assumptions of his proof (infinite dimensionality) and show that in one dimension in Euclidean space the k-means algorithm has the consistency property. We also prove that in higher dimensional space, k-means is, in fact, inconsistent. This result is of practical importance when choosing testbeds for implementation of clustering algorithms while it tells under which circumstances clustering after consistency transformation shall return the same clusters. Two types of remedy are proposed: gravitational consistency property and dataset consistency property which both hold for k-means and hence are suitable when developing the mentioned testbeds.
Year
DOI
Venue
2021
10.1007/s10844-021-00657-6
JOURNAL OF INTELLIGENT INFORMATION SYSTEMS
Keywords
DocType
Volume
Cluster analysis, Consistency property, Gravitational consistency property, Fixed dimensional Euclidean space consistency, k-Means algorithm
Journal
57
Issue
ISSN
Citations 
3
0925-9902
0
PageRank 
References 
Authors
0.34
0
2
Name
Order
Citations
PageRank
Mieczyslaw A. Klopotek136678.58
Robert Klopotek230.75