Title
Metric and trigonometric pruning for clustering of uncertain data in 2D geometric space
Abstract
We study the problem of clustering data objects with location uncertainty. In our model, a data object is represented by an uncertainty region over which a probability density function (pdf) is defined. One method to cluster such uncertain objects is the UK-means algorithm [1], an extension of the traditional K-means algorithm that assigns each object to the cluster whose representative has the smallest expected distance from it. For an arbitrary pdf, calculating the expected distance between an object and a cluster representative requires expensive integration of the pdf. We study two pruning methods, pre-computation (PC) and cluster shift (CS), that can significantly reduce the number of integrations computed. Both pruning methods rely on good bounding techniques. We propose and evaluate two such techniques, based on metric properties (Met) and trigonometry (Tri). Our experimental results show that Tri offers very high pruning power: in some cases, more than 99.9% of the expected distance calculations are pruned. This results in a very efficient clustering algorithm.
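The bound-then-prune idea in the abstract can be illustrated with a small Python sketch. It is not the paper's PC, CS, Met, or Tri procedure. It only shows a generic triangle-inequality bound on the expected distance, under several assumptions: the pdf is approximated by weighted sample points, the expected distance to one fixed anchor point has been computed in advance, and all function names (`expected_distance`, `metric_bounds`, `assign_with_pruning`) are hypothetical.

```python
import math

def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def expected_distance(samples, weights, c):
    """Expected distance from an uncertain object to point c.
    Assumption: the pdf is approximated by weighted sample points,
    so the integral becomes a weighted sum (the expensive step)."""
    return sum(w * dist(x, c) for x, w in zip(samples, weights))

def metric_bounds(ed_anchor, anchor, c):
    """Bounds on ED(o, c) from a pre-computed ED(o, anchor).
    By the triangle inequality, for every sample x:
      |d(x, anchor) - d(anchor, c)| <= d(x, c) <= d(x, anchor) + d(anchor, c),
    and taking expectations gives the (lower, upper) pair returned here."""
    d = dist(anchor, c)
    return abs(ed_anchor - d), ed_anchor + d

def assign_with_pruning(samples, weights, anchor, ed_anchor, reps):
    """Return (index of closest representative, number of exact EDs computed)."""
    bounds = [metric_bounds(ed_anchor, anchor, c) for c in reps]
    min_upper = min(upper for _, upper in bounds)
    # A representative whose lower bound exceeds the smallest upper bound
    # cannot be the closest one, so it is pruned without integration.
    candidates = [j for j, (lower, _) in enumerate(bounds) if lower <= min_upper]
    if len(candidates) == 1:
        return candidates[0], 0
    exact = {j: expected_distance(samples, weights, reps[j]) for j in candidates}
    return min(exact, key=exact.get), len(candidates)

# Illustrative data: an object spread uniformly over a grid on [-1, 1] x [-1, 1].
samples = [(x / 4, y / 4) for x in range(-4, 5) for y in range(-4, 5)]
weights = [1 / len(samples)] * len(samples)
anchor = (0.0, 0.0)
ed_anchor = expected_distance(samples, weights, anchor)  # computed once, reused
reps = [(1.0, 0.5), (10.0, 0.0), (50.0, 50.0)]

best, n_exact = assign_with_pruning(samples, weights, anchor, ed_anchor, reps)
print("assigned to representative", best, "with", n_exact, "exact computations")
```

In this toy setting the two distant representatives are eliminated by their bounds alone, so no expected distance has to be evaluated during assignment. When representatives are close together the bounds overlap and the exact computation is still needed, which is why tighter bounds translate directly into more pruning.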
Year
2011
DOI
10.1016/j.is.2010.09.005
Venue
Inf. Syst.
Keywords
pruning method, uncertain data, arbitrary pdf, data uncertainty, cluster representative, efficient clustering algorithm, clustering data object, expected distance calculation, expected distance, geometric space, cluster shift, trigonometric pruning, data object, clustering, uk-means algorithm, k means algorithm, probability density function
Field
k-medians clustering, Trigonometry, Data mining, Clustering high-dimensional data, Computer science, Uncertain data, Cluster analysis, Probability density function, Pruning, Bounding overwatch
DocType
Journal
Volume
36
Issue
2
Journal
Information Systems
ISSN
Citations
5
PageRank
0.43
References
34
Authors
7
Name
Order
Citations
PageRank
Wang Kay Ngai12137.43
Ben Kao22358194.98
Reynold Cheng33069154.13
Michael Chau4147197.79
Sau Dan Lee562970.44
David W. Cheung61511156.71
Kevin Y. Yip760038.39