Title
Make Workers Work Harder: Decoupled Asynchronous Proximal Stochastic Gradient Descent.
Abstract
Asynchronous parallel optimization algorithms for solving large-scale machine learning problems have recently drawn significant attention from both academia and industry. This paper proposes a novel algorithm, decoupled asynchronous proximal stochastic gradient descent (DAP-SGD), to minimize an objective function that is the composite of the average of multiple empirical losses and a regularization term. Unlike traditional asynchronous proximal stochastic gradient descent (TAP-SGD), in which the master carries much of the computation load, the proposed algorithm offloads the majority of the computation from the master to the workers and leaves the master to perform simple addition operations. This strategy yields an easy-to-parallelize algorithm whose performance is justified by theoretical convergence analyses. Specifically, DAP-SGD achieves an $O(\log T/T)$ rate when the step size is diminishing and an ergodic $O(1/\sqrt{T})$ rate when the step size is constant, where $T$ is the total number of iterations.
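The decoupling described in the abstract can be illustrated with a minimal, hypothetical Python sketch (not the authors' reference implementation). It assumes an l1 regularizer, whose proximal operator is soft-thresholding, and a worker-side update that is only inferred from the abstract: in DAP-SGD the worker applies the proximal step to its locally read copy of the iterate and returns an increment, so the master only performs an addition, whereas in TAP-SGD the master evaluates the proximal mapping itself.

```python
# Minimal, hypothetical sketch of the decoupling idea (not the authors' code).
# Assumed problem: minimize (1/n) * sum_i f_i(x) + lam * ||x||_1, whose
# regularizer has soft-thresholding as its proximal operator.
import numpy as np

def prox_l1(z, step, lam):
    """Proximal operator of step * lam * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)

def tap_sgd_master_step(x, stoch_grad, step, lam):
    # TAP-SGD (as described in the abstract): the worker sends a stochastic
    # gradient and the master evaluates the proximal mapping itself.
    return prox_l1(x - step * stoch_grad, step, lam)

def dap_sgd_worker_delta(x_read, stoch_grad, step, lam):
    # DAP-SGD: the worker applies the proximal mapping to its (possibly stale)
    # read x_read of the iterate and ships back only an increment.
    return prox_l1(x_read - step * stoch_grad, step, lam) - x_read

def dap_sgd_master_step(x, delta):
    # The master is left with a simple addition.
    return x + delta

# Toy usage with a single least-squares sample gradient.
rng = np.random.default_rng(0)
x = np.zeros(5)
a, b = rng.normal(size=5), 1.0
g = (a @ x - b) * a  # stochastic gradient of 0.5 * (a'x - b)^2
x = dap_sgd_master_step(x, dap_sgd_worker_delta(x, g, step=0.1, lam=0.01))
```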
Year
2016
Venue
arXiv: Optimization and Control
Field
Convergence (routing), Asynchronous communication, Mathematical optimization, Stochastic gradient descent, Parallel optimization, Ergodic theory, Algorithm, Regularization (mathematics), Mathematics, Computation
DocType
Volume
abs/1605.06619
Citations
1
Journal
PageRank
0.36
References
8
Authors
4
Name            Order  Citations  PageRank
Yitan Li        1      32         3.11
Linli Xu        2      790        42.51
Xiaowei Zhong   3      1          0.36
Qing Ling       4      968        60.48