Title
A Data Set Oriented Approach for Clustering Algorithm Selection
Abstract
In recent years, the availability of huge transactional and experimental data sets and the emerging requirements of data mining have created a need for clustering algorithms that scale and can be applied in diverse domains. Thus, a variety of algorithms have been proposed that have applications in different fields and may produce different partitionings of a data set, depending on the specific clustering criterion used. Moreover, since clustering is an unsupervised process, most algorithms rely on assumptions in order to define a partitioning of a data set. Consequently, in most applications the final clustering scheme requires some sort of evaluation. In this paper we present a clustering validity procedure which, taking into account the inherent features of a data set, evaluates the results of different clustering algorithms applied to it. A validity index, S_Dbw, is defined according to well-known clustering criteria so as to enable the selection of the algorithm providing the best partitioning of a data set. We evaluate the reliability of our approach both theoretically and experimentally, considering three representative clustering algorithms run on synthetic and real data sets. It performed favorably in all studies, giving an indication of the algorithm that is suitable for the considered application.
Year
2001
Venue
PKDD
Keywords
specific clustering criterion, different clustering, well-known clustering criterion, clustering algorithm selection, best partitioning, data set oriented approach, clustering validity procedure, clustering algorithm, experimental data set, final clustering scheme, indexation, data mining
Field
Fuzzy clustering, Data mining, Canopy clustering algorithm, CURE data clustering algorithm, Data stream clustering, Correlation clustering, Computer science, Determining the number of clusters in a data set, Constrained clustering, Artificial intelligence, Cluster analysis, Machine learning
DocType
Conference
ISBN
3-540-42534-9
Citations
12
PageRank
1.21
References
11
Authors
2
Name                   Order  Citations  PageRank
Maria Halkidi          1      1304       72.90
Michalis Vazirgiannis  2      3942       268.00