Abstract | ||
---|---|---|
Clustering high-dimensional data of arbitrary shape at various granularities is a challenge in data mining. This paper proposes the Nearest Neighbors Absorbed First (NNAF) clustering algorithm to address the problem, based on the idea that objects in the same cluster must be near one another. The main contributions are: (1) A theorem on searching nearest neighbors (SNN) is proved. Based on it, SNN algorithms are proposed with time complexity O(n*log(n)) or O(n). They are much faster than the traditional nearest-neighbor search algorithm, which takes O(n^2). (2) The NNAF clustering algorithm, which processes high-dimensional data of arbitrary shape, is proposed with time complexity O(n). The experiments show that the new algorithms can efficiently process noisy high-dimensional data of arbitrary shape. They can quickly produce clusterings at various granularities with little domain knowledge. |
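
The abstract's core idea, that objects in the same cluster must be near one another, can be illustrated with a minimal sketch. This is not the paper's NNAF or SNN algorithm, whose details are not given here. It uses the brute-force O(n^2) nearest-neighbor search that the abstract names as the traditional baseline, then merges each point with its nearest neighbor using a union-find structure. The function names and the `threshold` parameter are assumptions introduced for illustration only.

```python
import math


def nearest_neighbors(points):
    """Brute-force O(n^2) search (the 'traditional' baseline).

    Returns, for each point, a tuple (index, distance) of its nearest
    other point, or (None, inf) if there is no other point.
    """
    result = []
    for i, p in enumerate(points):
        best_j, best_d = None, math.inf
        for j, q in enumerate(points):
            if i != j:
                d = math.dist(p, q)  # Euclidean distance, Python 3.8+
                if d < best_d:
                    best_j, best_d = j, d
        result.append((best_j, best_d))
    return result


def absorb_clusters(points, threshold):
    """Merge each point with its nearest neighbor if within `threshold`.

    `threshold` is a hypothetical granularity knob, not a parameter
    taken from the paper. Returns one cluster label per point.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (j, d) in enumerate(nearest_neighbors(points)):
        if j is not None and d <= threshold:
            parent[find(i)] = find(j)

    # Relabel roots as 0..k-1 in order of first appearance.
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]
```

For example, `absorb_clusters([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (50, 50)], threshold=2.0)` groups the first three points together, the next two together, and leaves the distant point on its own. Raising `threshold` lets the distant point be absorbed into its nearest neighbor's cluster, which loosely mirrors the coarser-granularity clusterings the abstract mentions.
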
Year | DOI | Venue |
---|---|---|
2005 | 10.1007/11563952_67 | WAIM |
Keywords | Field | DocType |
various granularity,data mining,snn algorithm,time complexity o,nearest neighbors algorithm,new algorithm,nearest neighbor,clustering algorithm,high dimensional data,arbitrary shape,domain knowledge,time complexity | Canopy clustering algorithm,Data mining,Clustering high-dimensional data,CURE data clustering algorithm,Data stream clustering,Search algorithm,Correlation clustering,Computer science,Artificial intelligence,Nearest-neighbor chain algorithm,Cluster analysis,Machine learning | Conference |
Volume | ISSN | ISBN |
3739 | 0302-9743 | 3-540-29227-6 |
Citations | PageRank | References |
2 | 0.38 | 8 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jianjun Hu | 1 | 9 | 2.36 |
Changjie Tang | 2 | 483 | 62.75 |
Jing Peng | 3 | 16 | 3.00 |
Chuan Li | 4 | 49 | 5.32 |
Chang-an Yuan | 5 | 85 | 9.88 |
An-Long Chen | 6 | 61 | 4.65 |