Title
Fast rank-2 nonnegative matrix factorization for hierarchical document clustering
Abstract
Nonnegative matrix factorization (NMF) has been successfully used as a clustering method, especially for flat partitioning of documents. In this paper, we propose an efficient hierarchical document clustering method based on a new algorithm for rank-2 NMF. When the two-block coordinate descent framework of nonnegative least squares (NNLS) is applied to computing rank-2 NMF, each subproblem requires solving an NNLS problem with only two columns in the matrix. We design the algorithm for rank-2 NMF by exploiting the fact that an exhaustive search for the optimal active set can be performed extremely fast when solving these NNLS problems. In addition, we design a measure based on the results of rank-2 NMF for determining which leaf node should be further split. On a number of text data sets, our proposed method produces high-quality tree structures in significantly less time than other methods such as hierarchical K-means, standard NMF, and latent Dirichlet allocation.
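The two-column NNLS idea in the abstract can be illustrated with a short NumPy sketch. This is not the authors' implementation: the function names `nnls_two_columns` and `rank2_nmf`, the random initialization, and the fixed iteration count are all assumptions made for illustration. With only two unknowns per right-hand side, the candidate active sets are few: either the unconstrained least squares solution is already nonnegative, or the optimum lies on a boundary where one of the two variables is zero. The sketch assumes the two columns are nonzero and linearly independent, and does no handling of degenerate cases.

```python
import numpy as np


def nnls_two_columns(B, Y):
    """Solve min_{G >= 0} ||B G - Y||_F when B has exactly two columns.

    B: (m, 2), Y: (m, n). Returns G with shape (2, n).
    Assumes the columns of B are nonzero and linearly independent.
    """
    BtB = B.T @ B  # 2 x 2 Gram matrix
    BtY = B.T @ Y  # 2 x n

    # Candidate 1: unconstrained solution (both variables free).
    G = np.linalg.solve(BtB, BtY)

    # Candidates 2 and 3: only one variable free, the other fixed at zero.
    g1 = np.maximum(BtY[0] / BtB[0, 0], 0.0)
    g2 = np.maximum(BtY[1] / BtB[1, 1], 0.0)

    # Reduction in squared residual achieved by each single-column candidate.
    gain1 = g1 * g1 * BtB[0, 0]
    gain2 = g2 * g2 * BtB[1, 1]
    use_first = gain1 >= gain2

    # Where the unconstrained solution is infeasible, the optimum lies on the
    # boundary, so keep the better of the two single-column candidates.
    infeasible = (G < 0).any(axis=0)
    G[0, infeasible] = np.where(use_first, g1, 0.0)[infeasible]
    G[1, infeasible] = np.where(use_first, 0.0, g2)[infeasible]
    return G


def rank2_nmf(A, n_iter=50, seed=0):
    """Alternate the two NNLS subproblems so that A is approximated by W @ H.T."""
    rng = np.random.default_rng(seed)
    W = rng.random((A.shape[0], 2))  # hypothetical random initialization
    for _ in range(n_iter):
        H = nnls_two_columns(W, A).T    # fix W, update H: shape (n, 2)
        W = nnls_two_columns(H, A.T).T  # fix H, update W: shape (m, 2)
    return W, H
```

For hierarchical clustering, one plausible use of the result is to assign each document (a column of `A`) to the side of the split given by the larger entry in its row of `H`. The measure the paper uses to choose which leaf to split next is not described in this record and is not sketched here.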
Year: 2013
DOI: 10.1145/2487575.2487606
Venue: KDD
Keywords: nonnegative matrix factorization, rank-2 nonnegative matrix factorization, descent framework, nnls problem, standard nmf, hierarchical document clustering, rank-2 nmf, hierarchical k-means, efficient hierarchical document, clustering method, new algorithm, active set algorithm
Field: Least squares, Latent Dirichlet allocation, Active set method, Pattern recognition, Computer science, Document clustering, Tree structure, Non-negative matrix factorization, Artificial intelligence, Coordinate descent, Cluster analysis, Machine learning
DocType: Conference
Citations: 28
PageRank: 1.20
References: 19
Authors: 2
Name          Order  Citations  PageRank
Da Kuang      1      119        5.90
Haesun Park   2      3546       232.42