Abstract |
---|
A new geometrically-motivated algorithm for topic modeling is developed and applied to the discovery of latent "topics" in text and image "document" corpora. The algorithm is based on robustly finding and clustering extreme-points of empirical cross-document word-frequencies that correspond to novel words unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets. |
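
The extreme-point idea in the abstract can be illustrated with a minimal sketch. It assumes a noiseless toy model in which each topic has one novel word, and it uses random projections to pick out extreme rows of the row-normalized word-document matrix. The function name `find_extreme_rows`, the toy matrices, and all parameter values are illustrative assumptions, not the authors' implementation; the robust detection and clustering steps mentioned in the abstract are omitted.

```python
import numpy as np

def find_extreme_rows(X, num_projections=200, seed=0):
    """Return sorted indices of rows of X that maximize some random projection."""
    rng = np.random.default_rng(seed)
    # Row-normalize so each word's cross-document frequencies sum to 1.
    X_bar = X / X.sum(axis=1, keepdims=True)
    directions = rng.standard_normal((X_bar.shape[1], num_projections))
    # A linear functional over a convex hull attains its maximum at a vertex,
    # so each argmax is an extreme point of the set of normalized rows.
    winners = np.argmax(X_bar @ directions, axis=0)
    return sorted(set(winners.tolist()))

# Toy separable model (hypothetical sizes and values).
rng = np.random.default_rng(1)
K, M = 3, 40                                    # topics, documents
beta = np.zeros((9, K))                         # word-by-topic matrix
beta[0, 0] = beta[1, 1] = beta[2, 2] = 0.2      # words 0, 1, 2 are novel words
beta[3:] = rng.uniform(0.1, 1.0, size=(6, K))   # words 3..8 appear in every topic
beta /= beta.sum(axis=0)                        # each topic is a distribution over words
theta = rng.dirichlet(np.ones(K), size=M).T     # topic-by-document weights
X = beta @ theta                                # noiseless word-document frequencies

extreme = find_extreme_rows(X)
print(extreme)
```

In this noiseless setting the normalized row of every shared word is a strict convex combination of the novel-word rows, so only novel words can win a projection. With real, noisy counts the extreme points would need the robust treatment the abstract refers to.
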
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/ICASSP.2013.6638729 | 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Keywords | Field | DocType |
---|---|---|
Topic modeling, nonnegative matrix factorization (NMF), extreme points, subspace clustering | Data mining, Computer science, Document clustering, Heuristics, Polynomial complexity, Artificial intelligence, Cluster analysis, Optimization problem, Pattern recognition, Approximation theory, Regular polygon, Topic model, Machine learning | Conference |
ISSN | Citations | PageRank |
---|---|---|
1520-6149 | 3 | 0.40 |
References | Authors |
---|---|
9 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Weicong Ding | 1 | 33 | 2.82 |
Mohammad H. Rohban | 2 | 57 | 5.28 |
Prakash Ishwar | 3 | 951 | 67.13 |
Venkatesh Saligrama | 4 | 1350 | 112.74 |