Title
High Performance LDA through Collective Model Communication Optimization
Abstract
LDA is a widely used machine learning technique for big data analysis. Its inference algorithm iteratively updates a model until the model converges. A major challenge in parallelization is scaling: the model is huge, and parallel workers must communicate it continually. We identify three important features of the model in parallel LDA computation: (1) the volume of model parameters required for local computation is high; (2) the time complexity of local computation is proportional to the required model size; (3) the model size shrinks as the model converges. By investigating collective and asynchronous methods for model communication in different tools, we find that optimized collective communication improves the model update speed and thus allows the model to converge faster. The performance improvement derives not only from accelerated communication but also from reduced iteration computation time, since the model size shrinks during model convergence. To foster faster model convergence, we design new collective communication abstractions and implement two Harp-LDA applications, lgs and rtt. We compare our new approach with Yahoo! LDA and Petuum LDA, two leading implementations favoring asynchronous communication, on a 100-node, 4000-thread Intel Haswell cluster. The experiments show that lgs reaches higher model likelihood in shorter or similar execution time than Yahoo! LDA, while rtt runs up to 3.9 times faster than Petuum LDA when achieving similar model likelihood.
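To make the abstract's "iteratively updates a model" concrete, below is a minimal single-process sketch of the collapsed Gibbs sampling update loop that parallel LDA systems repeat every iteration. This is an illustrative assumption, not the paper's Harp-LDA code: the function name, hyperparameter values, and toy corpus are invented for the example. Each token update reads and writes the shared word-topic model, which is exactly the state that the paper's collective communication optimizations synchronize across workers.

```python
import random

def gibbs_lda(docs, vocab_size, num_topics=4, alpha=0.1, beta=0.01, iters=50):
    # Model state: per-document topic counts, word-topic counts, topic totals.
    doc_topic = [[0] * num_topics for _ in docs]
    word_topic = [[0] * num_topics for _ in range(vocab_size)]
    topic_total = [0] * num_topics
    # Random initial topic assignment for every token.
    assign = []
    for d, doc in enumerate(docs):
        z = [random.randrange(num_topics) for _ in doc]
        assign.append(z)
        for w, k in zip(doc, z):
            doc_topic[d][k] += 1
            word_topic[w][k] += 1
            topic_total[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                word_topic[w][k] -= 1
                topic_total[k] -= 1
                # Sample a new topic from the conditional posterior.
                weights = [(doc_topic[d][t] + alpha)
                           * (word_topic[w][t] + beta)
                           / (topic_total[t] + vocab_size * beta)
                           for t in range(num_topics)]
                r = random.random() * sum(weights)
                k = num_topics - 1
                for t in range(num_topics):
                    r -= weights[t]
                    if r < 0:
                        k = t
                        break
                assign[d][i] = k
                doc_topic[d][k] += 1
                word_topic[w][k] += 1
                topic_total[k] += 1
    # In realistic corpora the word-topic matrix grows sparser as sampling
    # converges, which is the shrinking model size the abstract refers to.
    return word_topic

# Toy corpus: each document is a list of integer word ids.
docs = [[0, 1, 2, 1], [2, 3, 3, 0], [1, 1, 2, 3]]
print(gibbs_lda(docs, vocab_size=4))
```

In a distributed setting, each worker holds a partition of docs but needs the word-topic rows for its local vocabulary every iteration; the volume of those rows, and the fact that they shrink as the model converges, are the model features the paper exploits.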
Year
2016
DOI
10.1016/j.procs.2016.05.300
Venue
ICCS
Keywords
Latent Dirichlet Allocation, Parallel Algorithm, Big Model, Communication Model, Communication Optimization
Field
Convergence (routing), Asynchronous communication, Data mining, Latent Dirichlet allocation, Computer science, Parallel algorithm, Models of communication, Artificial intelligence, Time complexity, Machine learning, Performance improvement, Computation
DocType
Conference
Volume
80
Issue
C
ISSN
1877-0509
Citations
4
PageRank
0.45
References
9
Authors
3
Name            Order  Citations  PageRank
Bingjing Zhang  1      521        25.17
Bo Peng         2      9          2.91
Judy Qiu        3      743        43.25