Title |
---|
PLDA+: Parallel latent Dirichlet allocation with data placement and pipeline processing |
Abstract |
---|
Previous methods of distributed Gibbs sampling for LDA run into either memory or communication bottlenecks. To improve scalability, we propose four strategies: data placement, pipeline processing, word bundling, and priority-based scheduling. Experiments show that our strategies significantly reduce the unparallelizable communication bottleneck and achieve good load balancing, and hence improve the scalability of LDA. |
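For context on the algorithm being distributed, below is a minimal single-machine collapsed Gibbs sampler for LDA. This is only an illustrative sketch, not the paper's PLDA+ implementation; the function name and parameters are hypothetical. The per-token update shown here is what PLDA+ parallelizes, with the word-topic counts (`topic_word`) being the shared state whose placement and communication the paper's four strategies address.

```python
import random

def gibbs_lda(docs, num_topics, vocab_size, iters=50, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a toy corpus.

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns (doc_topic, topic_word) count matrices as nested lists.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]        # n_dk: topic counts per doc
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n_kw: word counts per topic
    topic_total = [0] * num_topics                      # n_k: total tokens per topic
    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        zs = [rng.randrange(num_topics) for _ in doc]
        assign.append(zs)
        for w, z in zip(doc, zs):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the token's current assignment from all counts.
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Full conditional: p(z=k) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V*beta)
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word
```

In the distributed setting the paper targets, each sampling step needs the current `topic_word` column for word `w`, which is why placing those counts across dedicated machines (data placement), overlapping fetch/sample/update (pipeline processing), and grouping requests by word (word bundling) reduce the communication cost.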
Year | DOI | Venue |
---|---|---|
2011 | 10.1145/1961189.1961198 | ACM TIST |
Keywords | Field | DocType |
---|---|---|
pipeline processing,communication bottleneck,latent dirichlet allocation,unparallelizable communication bottleneck,gibbs sampling,good load balancing,parallel latent dirichlet allocation,priority-based scheduling,data placement,distributed parallel computations,topic models,load balance | Bottleneck,Latent Dirichlet allocation,Load balancing (computing),Computer science,Scheduling (computing),Parallel computing,Topic model,Gibbs sampling,Scalability,Distributed computing | Journal |
Volume | Issue | ISSN |
---|---|---|
2 | 3 | 2157-6904 |
Citations | PageRank | References |
---|---|---|
83 | 2.33 | 16 |
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Zhiyuan Liu | 1 | 2037 | 123.68 |
Yuzhou Zhang | 2 | 150 | 7.63 |
Edward Y. Chang | 3 | 4519 | 336.59 |
Maosong Sun | 4 | 2293 | 162.86 |