Title
Validity indices for clusters of uncertain data objects
Abstract
Clustering validity indices are the main tools for evaluating the quality of formed clusters and determining the correct number of clusters. They can be applied to the results of clustering algorithms to validate the performance of those algorithms. In this paper, two clustering validity indices, named uncertain Silhouette and Order Statistic, are developed for uncertain data. To the best of our knowledge, no clustering validity index in the literature is designed for uncertain objects or can be used to validate the performance of uncertain clustering algorithms. Our proposed validity indices use probabilistic distance measures to capture the distance between uncertain objects. In validating clusters of uncertain data objects, they outperform existing validity indices designed for certain (deterministic) data, and they are robust to outliers. The Order Statistic index in particular, a general form of the uncertain Dunn validity index (also developed here), handles well the instances where a single cluster is relatively scattered (not compact) compared to the other clusters, or where two clusters are close (not well separated) compared to the other clusters. Such instances can cause existing clustering validity indices to fail to detect the correct number of clusters.
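The idea of feeding a probabilistic distance into a validity index can be illustrated with a minimal sketch. It is not the paper's definition of the uncertain Silhouette or uncertain Dunn index, which the abstract does not give. It assumes that each uncertain object is represented by a finite sample of points, that the probabilistic distance is the mean Euclidean distance over all sample pairs, and that this distance is plugged into the classical Silhouette and Dunn formulas. All function names are hypothetical.

```python
import numpy as np

def expected_distance(a, b):
    """Mean Euclidean distance over all sample pairs of two objects.
    Assumed stand-in for a probabilistic distance measure."""
    diff = a[:, None, :] - b[None, :, :]          # shape (len(a), len(b), dim)
    return float(np.sqrt((diff ** 2).sum(axis=-1)).mean())

def distance_matrix(objects):
    """Symmetric matrix of pairwise distances; the diagonal is set to 0
    by convention."""
    n = len(objects)
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d[i, j] = d[j, i] = expected_distance(objects[i], objects[j])
    return d

def silhouette(d, labels):
    """Classical mean Silhouette width; needs at least two clusters."""
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    scores = []
    for i in range(len(labels)):
        same = labels == labels[i]
        same[i] = False                           # exclude the object itself
        if not same.any():                        # singleton cluster scores 0
            scores.append(0.0)
            continue
        a = d[i, same].mean()                     # mean within-cluster distance
        b = min(d[i, labels == c].mean()          # nearest other cluster
                for c in clusters if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

def dunn(d, labels):
    """Classical Dunn index: smallest between-cluster distance divided by
    the largest cluster diameter. Needs at least two clusters and at
    least one cluster with two or more objects."""
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    diameter = max(d[np.ix_(labels == c, labels == c)].max() for c in clusters)
    separation = min(d[np.ix_(labels == p, labels == q)].min()
                     for k, p in enumerate(clusters) for q in clusters[k + 1:])
    return float(separation / diameter)

# Four uncertain objects on a line, each given by two sample points.
objects = [np.array([[x - 0.1, 0.0], [x + 0.1, 0.0]])
           for x in (0.0, 1.0, 10.0, 11.0)]
d = distance_matrix(objects)
print(silhouette(d, [0, 0, 1, 1]), dunn(d, [0, 0, 1, 1]))
```

In this toy data, the grouping that pairs nearby objects scores higher on both indices than a grouping that mixes distant objects. Because the classical Dunn index depends only on the single largest diameter and the single smallest separation, one scattered cluster or one close pair of clusters dominates its value, which is the situation the abstract says the Order Statistic index is designed to handle.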
Year
2021
DOI
10.1007/s10479-018-3043-4
Venue
Annals of Operations Research
Keywords
Clustering validity index, Uncertain data, Probabilistic distance measures, Data mining
DocType
Journal
Volume
303
Issue
1
ISSN
1572-9338
Citations
1
PageRank
0.36
References
14
Authors
3
Name | Order | Citations | PageRank
Behnam Tavakkol | 1 | 3 | 1.76
Myong K Jeong | 2 | 424 | 33.06
Susan L. Albin | 3 | 31 | 4.39