Abstract |
---|
In this paper we focus on the problem of finding the optimal weights of the shallowest of neural networks: one consisting of a single Rectified Linear Unit (ReLU). These functions take the form x ↦ max(0, ⟨w, x⟩), where w ∈ R^d denotes the weight vector. We focus on a planted model in which the inputs are drawn i.i.d. from a Gaussian distribution and the labels are generated according to a planted weight vector. We first show that mini-batch stochastic gradient descent (SGD), when suitably initialized, converges at a geometric rate to the planted model with a number of samples that is optimal up to numerical constants. Next we consider a parallel implementation in which, at each iteration, the mini-batch gradient is computed in a distributed manner across multiple processors and then broadcast to a master or to all other processors. To reduce the communication cost in this setting we use a Quantized Stochastic Gradient Descent (QSGD) scheme in which the partial gradients are quantized. Perhaps unexpectedly, we show that QSGD retains the fast convergence of SGD to a globally optimal model while significantly reducing the communication cost. We further corroborate these findings via various numerical experiments, including distributed implementations over Amazon EC2. |
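The setup described in the abstract can be sketched in a few lines of NumPy. The snippet below is an illustrative toy reconstruction, not the authors' implementation: it generates a planted single-ReLU model with Gaussian inputs, runs mini-batch SGD from an initialization near the planted weights, and applies an unbiased QSGD-style stochastic quantizer to each gradient before the update. The level count `s`, learning rate, batch size, and initialization scale are all assumed values chosen for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

def qsgd_quantize(v, s=4):
    """Unbiased stochastic quantization of v to s levels per coordinate
    (a QSGD-style sketch: round |v_i|/||v|| up or down so the result
    is correct in expectation)."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v
    ratio = np.abs(v) / norm              # each entry lies in [0, 1]
    lower = np.floor(ratio * s)           # lower quantization level
    p_up = ratio * s - lower              # probability of rounding up
    levels = (lower + (rng.random(v.shape) < p_up)) / s
    return norm * np.sign(v) * levels

# Planted model: i.i.d. Gaussian inputs, labels from a single ReLU.
d, n = 10, 500
w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.maximum(0.0, X @ w_star)

# Mini-batch SGD with quantized gradients, initialized near w_star
# (standing in for the paper's "suitable initialization").
w = w_star + 0.1 * rng.standard_normal(d)
lr, batch = 0.05, 32
for _ in range(2000):
    idx = rng.integers(0, n, batch)
    Xb, yb = X[idx], y[idx]
    pred = np.maximum(0.0, Xb @ w)
    # (sub)gradient of 0.5 * mean (pred - y)^2 with respect to w
    g = Xb.T @ ((pred - yb) * (Xb @ w > 0)) / batch
    w -= lr * qsgd_quantize(g)

print(np.linalg.norm(w - w_star))
```

Because the quantizer is unbiased and its noise scales with the gradient norm, the quantization error shrinks as the iterates approach the planted weights, which is the intuition behind QSGD preserving the fast convergence of plain SGD.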
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ISIT.2019.8849667 | 2019 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT) |
Field | DocType | Volume |
Convergence (routing),Discrete mathematics,Stochastic gradient descent,Mathematical optimization,Rectifier (neural networks),Weight,Gaussian,Quantization (physics),Artificial neural network,Mathematics,Exponential growth | Journal | abs/1901.06587 |
Citations | PageRank | References |
0 | 0.34 | 15 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Seyed Mohammadreza Mousavi Kalan | 1 | 14 | 1.99 |
Mahdi Soltanolkotabi | 2 | 409 | 25.97 |
Amir Salman Avestimehr | 3 | 1880 | 157.39 |