Title
Clustering Validity Assessment: Finding the Optimal Partitioning of a Data Set
Abstract
Clustering is a mostly unsupervised procedure, and the majority of clustering algorithms depend on certain assumptions in order to define the subgroups present in a data set. As a consequence, in most applications the resulting clustering scheme requires some sort of evaluation as regards its validity. In this paper we present a clustering validity procedure, which evaluates the results of clustering algorithms on data sets. We define a validity index, S_Dbw, based on well-defined clustering criteria, enabling the selection of the optimal input parameter values for a clustering algorithm that result in the best partitioning of a data set. We evaluate the reliability of our index both theoretically and experimentally, considering three representative clustering algorithms run on synthetic and real data sets. We also carried out an evaluation study to compare the performance of S_Dbw with other known validity indices. Our approach performed favorably in all cases, even in those where other indices failed to indicate the correct partitions in a data set.
Year
2001
DOI
10.1109/ICDM.2001.989517
Venue
ICDM
Keywords
unsupervised procedure, clustering scheme, clustering validity assessment, validity index, clustering validity procedure, s_dbw performance, clustering algorithm, known validity index, well-defined clustering criterion, optimal partitioning, clustering algorithms, multidimensional systems, data visualization, reliability theory, informatics, geometry, data mining, indexation, reliability, visual perception
DocType
Conference
ISBN
0-7695-1119-8
Citations
102
PageRank
5.14
References
19
Authors
2
Name | Order | Citations | PageRank
Maria Halkidi | 1 | 1304 | 72.90
Michalis Vazirgiannis | 2 | 3942 | 268.00