Abstract | ||
---|---|---|
During the past decades, many spectral clustering algorithms have been proposed. However, their high computational complexities hinder their applications on large-scale data. Moreover, most of them use a two-step approach to obtain the optimal solution, which may deviate from the solution by directly solving the original problem. In this paper, we propose a new optimization algorithm, namely Direct Normalized Cut (DNC), to directly optimize the normalized cut model. DNC has a quadratic time complexity, which is a significant reduction comparing with the cubic time complexity of the traditional spectral clustering. To cope with large-scale data, a Fast Normalized Cut (FNC) method with linear time and space complexities is proposed by extending DNC with an anchor-based strategy. In the new method, we first seek a set of anchors and then construct a representative similarity matrix by computing distances between the anchors and the whole data set. To find high quality anchors that best represent the whole data set, we propose a Balanced k-means (BKM) to partition a data set into balanced clusters and use the cluster centers as anchors. Then DNC is used to obtain the final clustering result from the representative similarity matrix. A series of experiments were conducted on both synthetic data and real-world data sets, and the experimental results show the superior performance of BKM, DNC and FNC.
|
Year | DOI | Venue |
---|---|---|
2018 | 10.1145/3219819.3220039 | KDD |
Keywords | Field | DocType |
Clustering,normalized cut,large-scale data | Cluster (physics),Data mining,Spectral clustering,Data set,Normalization (statistics),Computer science,Algorithm,Synthetic data,Partition (number theory),Cluster analysis,Time complexity | Conference |
ISBN | Citations | PageRank |
978-1-4503-5552-0 | 12 | 0.50 |
References | Authors | |
18 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xiaojun Chen | 1 | 1298 | 107.51 |
Weijun Hong | 2 | 15 | 0.88 |
Feiping Nie | 3 | 7061 | 309.42 |
Dan He | 4 | 14 | 0.88 |
Min Yang | 5 | 155 | 41.56 |
Joshua Zhexue Huang | 6 | 1365 | 82.64 |