Title
High Performance LDA through Collective Model Communication Optimization
Abstract
LDA is a widely used machine learning technique for big data analysis. Its inference algorithm iteratively updates a model until the model converges. A major challenge in parallelization is scaling: the model is huge, and parallel workers must communicate it continually. We identify three important features of the model in parallel LDA computation: (1) the volume of model parameters required for local computation is high; (2) the time complexity of local computation is proportional to the required model size; (3) the model size shrinks as the model converges. By investigating collective and asynchronous methods for model communication in different tools, we find that optimized collective communication improves the model update speed and thus allows the model to converge faster. The performance improvement derives not only from accelerated communication but also from reduced iteration computation time, since the model size shrinks during model convergence. To foster faster model convergence, we design new collective communication abstractions and implement two Harp-LDA applications, lgs and rtt. We compare our new approach with Yahoo! LDA and Petuum LDA, two leading implementations favoring asynchronous communication, on a 100-node, 4000-thread Intel Haswell cluster. The experiments show that lgs reaches higher model likelihood in shorter or similar execution time than Yahoo! LDA, while rtt runs up to 3.9 times faster than Petuum LDA when achieving similar model likelihood.
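To make the abstract's "iteratively updates a model" concrete, below is a minimal single-process sketch of the collapsed Gibbs sampling update loop that parallel LDA systems repeat every iteration. This is an illustrative assumption, not the paper's Harp-LDA code: the function name, hyperparameter values, and toy corpus are invented for the example. Each token update reads and writes the shared word-topic model, which is exactly the state that the paper's collective communication optimizations synchronize across workers.

```python
import random

def gibbs_lda(docs, vocab_size, num_topics=4, alpha=0.1, beta=0.01, iters=50):
    # Model state: per-document topic counts, word-topic counts, topic totals.
    doc_topic = [[0] * num_topics for _ in docs]
    word_topic = [[0] * num_topics for _ in range(vocab_size)]
    topic_total = [0] * num_topics
    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z = [random.randrange(num_topics) for _ in doc]
        assign.append(z)
        for w, k in zip(doc, z):
            doc_topic[d][k] += 1
            word_topic[w][k] += 1
            topic_total[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                word_topic[w][k] -= 1
                topic_total[k] -= 1
                # Sample a new topic from the conditional posterior.
                weights = [(doc_topic[d][t] + alpha)
                           * (word_topic[w][t] + beta)
                           / (topic_total[t] + vocab_size * beta)
                           for t in range(num_topics)]
                r = random.random() * sum(weights)
                k = num_topics - 1
                for t in range(num_topics):
                    r -= weights[t]
                    if r < 0:
                        k = t
                        break
                assign[d][i] = k
                doc_topic[d][k] += 1
                word_topic[w][k] += 1
                topic_total[k] += 1
    # In realistic corpora the word-topic matrix grows sparser as sampling
    # converges, which is the shrinking model size the abstract refers to.
    return word_topic

# Toy corpus: each document is a list of integer word ids.
docs = [[0, 1, 2, 1], [2, 3, 3, 0], [1, 1, 2, 3]]
print(gibbs_lda(docs, vocab_size=4))
```

In a distributed setting, each worker holds a partition of docs but needs the word-topic rows for its local vocabulary every iteration; the volume of those rows, and the fact that they shrink as the model converges, are the model features the paper exploits.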
Year
2016
DOI
10.1016/j.procs.2016.05.300
Venue
ICCS
Keywords
Latent Dirichlet Allocation, Parallel Algorithm, Big Model, Communication Model, Communication Optimization
Field
Convergence (routing), Asynchronous communication, Data mining, Latent Dirichlet allocation, Computer science, Parallel algorithm, Models of communication, Artificial intelligence, Time complexity, Machine learning, Performance improvement, Computation
DocType
Conference
Volume
80
Issue
C
ISSN
1877-0509
Citations
4
PageRank
0.45
References
9
Authors
3
Name            Order  Citations  PageRank
Bingjing Zhang  1      521        25.17
Bo Peng         2      9          2.91
Judy Qiu        3      743        43.25