Abstract | ||
---|---|---|
Nonnegative matrix factorization (NMF) has been successfully used as a clustering method, especially for flat partitioning of documents. In this paper, we propose an efficient hierarchical document clustering method based on a new algorithm for rank-2 NMF. When the two-block coordinate descent framework of nonnegative least squares (NNLS) is applied to computing rank-2 NMF, each subproblem requires solving an NNLS problem with only two columns in the matrix. We design the algorithm for rank-2 NMF by exploiting the fact that an exhaustive search for the optimal active set can be performed extremely fast when solving these NNLS problems. In addition, we design a measure based on the results of rank-2 NMF for determining which leaf node should be further split. On a number of text data sets, our proposed method produces high-quality tree structures in significantly less time than other methods such as hierarchical K-means, standard NMF, and latent Dirichlet allocation. |
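
The abstract's central observation, that a nonnegative least squares (NNLS) problem with only two columns admits an exhaustive search over active sets, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `nnls_two_columns` and the brute-force enumeration of all four active sets are assumptions made for illustration, and the paper's algorithm may organize the search differently.

```python
import numpy as np

def nnls_two_columns(B, y):
    """Solve min ||B g - y||_2 subject to g >= 0, where B has two columns.

    Illustrative sketch (hypothetical helper, not the paper's code):
    enumerate every active set, solve the least squares problem on the
    free variables, and keep the best feasible candidate.
    """
    B = np.asarray(B, dtype=float)
    y = np.asarray(y, dtype=float)

    # Active set {1, 2}: both variables fixed at zero.
    candidates = [np.zeros(2)]

    # Active sets {2} and {1}: one variable free, the other fixed at zero.
    for j in range(2):
        b = B[:, j]
        denom = b @ b
        if denom > 0:
            g = np.zeros(2)
            g[j] = (b @ y) / denom
            candidates.append(g)

    # Empty active set: ordinary unconstrained least squares.
    g_free, *_ = np.linalg.lstsq(B, y, rcond=None)
    candidates.append(g_free)

    # Keep only nonnegative candidates, then pick the smallest residual.
    feasible = [g for g in candidates if np.all(g >= 0)]
    return min(feasible, key=lambda g: np.linalg.norm(B @ g - y))
```

Because only four active sets exist, each subproblem costs a constant number of inner products, which is consistent with the abstract's claim that the search is fast. In a two-block coordinate descent loop, a routine like this would be applied to each column of the data matrix with one factor held fixed, then again with the roles of the two factors swapped.
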
Year | DOI | Venue |
---|---|---|
2013 | 10.1145/2487575.2487606 | KDD |
Keywords | Field | DocType |
nonnegative matrix factorization, rank-2 nonnegative matrix factorization, descent framework, nnls problem, standard nmf, hierarchical document clustering, rank-2 nmf, hierarchical k-means, efficient hierarchical document, clustering method, new algorithm, active set algorithm | Least squares, Latent Dirichlet allocation, Active set method, Pattern recognition, Computer science, Document clustering, Tree structure, Non-negative matrix factorization, Artificial intelligence, Coordinate descent, Cluster analysis, Machine learning | Conference |
Citations | PageRank | References |
28 | 1.20 | 19 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Da Kuang | 1 | 119 | 5.90 |
Haesun Park | 2 | 3546 | 232.42 |