Abstract | ||
---|---|---|
We propose a new method, called SimClus, for clustering with a lower bound on similarity. Instead of taking the number of clusters k as input, this similarity-based approach imposes a lower bound on the similarity between each object and its cluster representative (with one representative per cluster). SimClus achieves an O(log n) approximation bound on the number of clusters, whereas for the best previous algorithm the bound can be as poor as O(n). Experiments on real and synthetic datasets show that our algorithm produces more than 40% fewer representative objects, yet offers the same or better clustering quality. We also propose a dynamic variant of the algorithm, which can be used effectively in an on-line setting. |
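
The problem setting in the abstract (choose as few representatives as possible so that every object is at least some threshold similar to a representative) can be read as a set cover instance, for which the standard greedy heuristic is known to give a logarithmic approximation. The sketch below illustrates that generic greedy heuristic only; the abstract does not state that SimClus works this way. The names `greedy_representatives`, `assign`, `sim`, and `beta` are hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")


def greedy_representatives(
    objects: Sequence[T], sim: Callable[[T, T], float], beta: float
) -> List[int]:
    """Pick representative indices so every object has similarity >= beta
    to at least one representative (generic greedy set cover, not SimClus)."""
    n = len(objects)
    # covers[i]: indices of objects that objects[i] could represent.
    # Assumption: every object can always represent itself.
    covers = [
        {j for j in range(n) if j == i or sim(objects[i], objects[j]) >= beta}
        for i in range(n)
    ]
    uncovered = set(range(n))
    reps: List[int] = []
    while uncovered:
        # Take the candidate covering the most still-uncovered objects;
        # ties go to the lowest index.
        best = max(range(n), key=lambda i: len(covers[i] & uncovered))
        reps.append(best)
        uncovered -= covers[best]
    return reps


def assign(
    objects: Sequence[T], reps: List[int], sim: Callable[[T, T], float]
) -> Dict[int, int]:
    """Map each object index to the index of its most similar representative."""
    return {
        j: max(reps, key=lambda r: sim(objects[r], objects[j]))
        for j in range(len(objects))
    }


# Toy data: 1-D points with a similarity that decays with distance.
points = [0.0, 0.1, 0.2, 5.0, 5.1]
similarity = lambda a, b: 1.0 / (1.0 + abs(a - b))
reps = greedy_representatives(points, similarity, beta=0.8)
clusters = assign(points, reps, similarity)
```

On the toy data, the number of clusters is not supplied by the caller; it falls out of the threshold `beta`. Building `covers` takes a quadratic number of similarity evaluations, so this sketch is only suitable for small inputs.
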
Year | DOI | Venue |
---|---|---|
2009 | 10.1007/978-3-642-01307-2_14 | PAKDD |
Keywords | DocType | Volume |
alternative similarity-based approach, Lower Bound, corresponding cluster representative, new method, dynamic variant, previous algorithm, synthetic datasets, on-line setting, clustering quality, fewer representative object | Conference | 5476 |
ISSN | Citations | PageRank |
0302-9743 | 2 | 0.44 |
References | Authors | |
7 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Mohammad Al Hasan | 1 | 427 | 35.08 |
Saeed Salem | 2 | 182 | 17.39 |
Benjarath Pupacdi | 3 | 2 | 0.44 |
Mohammed Javeed Zaki | 4 | 7972 | 536.24 |