Title
Distributed SGD Generalizes Well Under Asynchrony
Abstract
Fully synchronized distributed systems have become a performance bottleneck in the era of big data, and asynchronous distributed systems are gaining popularity due to their superior scalability. In this paper, we study the generalization performance of stochastic gradient descent (SGD) on an asynchronous distributed system. The system consists of multiple worker machines that compute stochastic gradients, which are sent to and aggregated at a common parameter server to update the variables; communication in the system may suffer from delays. Under the algorithmic stability framework, we prove that distributed asynchronous SGD generalizes well given sufficiently many training samples. In particular, our results suggest reducing the learning rate as more asynchrony is allowed in the distributed system. Such an adaptive learning-rate strategy improves the stability of the distributed algorithm and reduces the corresponding generalization error. Finally, we confirm our theoretical findings via numerical experiments.
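The update scheme described in the abstract (a parameter server applying stochastic gradients computed at stale copies of the parameters, with the step size shrunk as the allowed delay grows) can be sketched in a few lines. The following single-process Python simulation is illustrative only: the function names, the random-delay model, and the 1/(1 + max_delay) learning-rate scaling are assumptions for demonstration, not the paper's exact algorithm or schedule.

```python
import numpy as np

def simulate_async_sgd(grad_fn, x0, n_steps=1000, max_delay=8,
                       base_lr=0.1, rng=None):
    """Minimal single-process sketch of asynchronous SGD with a parameter
    server: each update applies a stochastic gradient evaluated at a stale
    iterate (staleness <= max_delay). The learning rate is reduced as the
    allowed delay grows, mirroring the paper's suggestion to decrease the
    step size under more asynchrony. The 1/(1 + max_delay) scaling is an
    illustrative choice, not the authors' schedule."""
    rng = np.random.default_rng() if rng is None else rng
    lr = base_lr / (1.0 + max_delay)       # smaller step size for larger allowed delay
    history = [np.array(x0, dtype=float)]  # past iterates that "workers" may have read
    x = history[0].copy()
    for _ in range(n_steps):
        delay = rng.integers(0, max_delay + 1)            # random staleness of this gradient
        stale_x = history[max(len(history) - 1 - delay, 0)]
        x = x - lr * grad_fn(stale_x, rng)                # server applies the delayed gradient
        history.append(x.copy())
    return x

# Usage example: noisy gradients of the quadratic f(x) = 0.5 * ||x||^2.
if __name__ == "__main__":
    noisy_grad = lambda x, rng: x + 0.1 * rng.standard_normal(x.shape)
    x_final = simulate_async_sgd(noisy_grad, x0=np.ones(5), max_delay=8)
    print("final iterate:", x_final)
```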
Year
2019
DOI
10.1109/ALLERTON.2019.8919791
Venue
2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
Field
Asynchronous communication, Bottleneck, Synchronization, Stochastic gradient descent, Asynchronous system, Computer science, Server, Distributed algorithm, Scalability, Distributed computing
DocType
Conference
ISSN
2474-0195
Citations
0
PageRank
0.34
References
0
Authors
5
Name               Order   Citations   PageRank
Jayanth Regatti    1       0           0.34
Gaurav Tendolkar   2       0           0.34
Yi Zhou            3       65          17.55
Abhishek Gupta     4       0           0.34
Yingbin Liang      5       1646        147.64