Abstract |
---|
We present a novel approach for LDA (Latent Dirichlet Allocation) topic reconstruction. The main technical idea is to show that the distribution over the documents generated by LDA can be transformed into a distribution for a much simpler generative model, in which documents are generated from the same set of topics but each document has a single topic, chosen uniformly at random. Furthermore, this reduction is approximation preserving, in the sense that approximate distributions (the only ones we can hope to compute in practice) are mapped into approximate distributions in the simplified world. This opens up the possibility of efficiently reconstructing LDA topics in a roundabout way: compute an approximate document distribution from the given corpus, transform it into an approximate distribution for the single-topic world, and run a reconstruction algorithm in the uniform, single-topic world, a much simpler task than direct LDA reconstruction. We show the viability of the approach by giving very simple algorithms for a generalization of two notable cases that have been studied in the literature, p-separability and matrix-like topics. |
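
To make the contrast between the two generative models in the abstract concrete, the following is a minimal sketch, not the authors' reduction or reconstruction algorithm. It only samples documents from standard LDA and from the simplified single-topic, uniform-topic model. The function names, the symmetric Dirichlet parameter `alpha`, and the toy `TOPICS` matrix are assumptions introduced for illustration.

```python
import random


def sample_dirichlet(alpha, k, rng):
    """Draw a symmetric Dirichlet(alpha) vector of length k via normalized Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def sample_lda_document(topics, alpha, length, rng):
    """LDA: draw a per-document topic mixture, then a topic and a word for each position."""
    k = len(topics)
    vocab = range(len(topics[0]))
    mixture = sample_dirichlet(alpha, k, rng)
    doc = []
    for _ in range(length):
        t = rng.choices(range(k), weights=mixture)[0]
        doc.append(rng.choices(vocab, weights=topics[t])[0])
    return doc


def sample_single_topic_document(topics, length, rng):
    """Simplified model: one topic, chosen uniformly at random, generates every word."""
    t = rng.randrange(len(topics))
    vocab = range(len(topics[0]))
    return rng.choices(vocab, weights=topics[t], k=length)


# Hypothetical toy topics over a 4-word vocabulary (assumption, not from the paper).
# Each row is a topic, i.e. a probability distribution over word indices 0..3.
TOPICS = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]

if __name__ == "__main__":
    rng = random.Random(0)
    print("LDA document:         ", sample_lda_document(TOPICS, 0.5, 8, rng))
    print("single-topic document:", sample_single_topic_document(TOPICS, 8, rng))
```

With these toy topics, a single-topic document only ever contains words from one topic's support, while an LDA document may mix words from both. That is the structural simplification the abstract refers to.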
Year | Venue | Keywords |
---|---|---|
2018 | Advances in Neural Information Processing Systems 31 (NIPS 2018) | generative model,Gibbs sampling,latent Dirichlet allocation,reconstruction algorithm |
Field | DocType | Volume |
Latent Dirichlet allocation,Mathematical optimization,Computer science,Algorithm,Reconstruction algorithm,SIMPLE algorithm,Gibbs sampling,Generative model | Conference | 31 |
ISSN | Citations | PageRank |
1049-5258 | 0 | 0.34 |
References | Authors |
0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Matteo Almanza | 1 | 0 | 1.01 |
Flavio Chierichetti | 2 | 626 | 39.42 |
Alessandro Panconesi | 3 | 1584 | 124.00 |
Andrea Vattani | 4 | 171 | 11.45 |