Title
vqSGD: Vector Quantized Stochastic Gradient Descent
Abstract
In this work, we present a family of vector quantization schemes, vqSGD (Vector-Quantized Stochastic Gradient Descent), that provide an asymptotic reduction in the communication cost with convergence guarantees in first-order distributed optimization. In the process, we derive the following fundamental information-theoretic fact: $\Theta\left(\frac{d}{R^{2}}\right)$ bits are necessary and sufficient (up to an additive $O(\log d)$ term) to describe an unbiased estimator $\hat{\boldsymbol{g}}(\boldsymbol{g})$ for any $\boldsymbol{g}$ in the $d$-dimensional unit sphere, under the constraint that $\|\hat{\boldsymbol{g}}(\boldsymbol{g})\|_{2} \le R$ almost surely, $R > 1$. In particular, we consider a randomized scheme based on the convex hull of a point set that returns an unbiased estimator of a $d$-dimensional gradient vector with almost surely bounded norm. We provide multiple efficient instances of our scheme that are near optimal and require only $o(d)$ bits of communication, at the expense of a tolerable increase in error. The instances of our quantization scheme are obtained using well-known families of binary error-correcting codes and provide a smooth tradeoff between the communication cost and the estimation error of quantization. Furthermore, we show that vqSGD also offers automatic privacy guarantees.
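To make the convex-hull idea concrete, here is a minimal Python sketch of one such scheme using the scaled cross-polytope point set $C = \{\pm\sqrt{d}\,e_i : i \in [d]\}$, whose convex hull contains the unit ball. A gradient $\boldsymbol{g}$ with $\|\boldsymbol{g}\|_2 \le 1$ is written as a convex combination of the points in $C$, and one point is sampled with probability equal to its coefficient; only the sampled index and sign (about $\log_2 d + 1$ bits) are communicated. The function names (vq_quantize, vq_decode) and the even redistribution of the leftover probability mass are illustrative assumptions, not necessarily the paper's exact algorithm.

```python
import numpy as np

def vq_quantize(g, rng=np.random.default_rng()):
    """Sample one vertex of the scaled cross-polytope C = {±√d·e_i}
    so that the sampled point is an unbiased estimator of g.
    Requires ||g||_2 <= 1, which implies ||g||_1 <= √d, i.e. g lies
    in conv(C). The output is an (index, sign) pair: log2(d)+1 bits."""
    d = g.shape[0]
    gamma = np.sqrt(d)
    probs = np.abs(g) / gamma        # mass on vertex sign(g_i)·γ·e_i
    leftover = 1.0 - probs.sum()     # remaining mass, spread evenly over C
    if rng.random() < leftover:
        # A uniformly random vertex of C has mean zero, so this branch
        # does not bias the estimator.
        i = rng.integers(d)
        s = rng.choice([-1.0, 1.0])
    else:
        i = rng.choice(d, p=probs / probs.sum())
        s = np.sign(g[i]) if g[i] != 0 else 1.0
    return i, s

def vq_decode(i, s, d):
    """Server-side reconstruction of the transmitted point s·√d·e_i."""
    v = np.zeros(d)
    v[i] = s * np.sqrt(d)
    return v

# Usage: averaging many quantized samples recovers g (unbiasedness).
d = 16
g = np.random.randn(d)
g /= 2 * np.linalg.norm(g)           # ensure ||g||_2 <= 1
est = np.mean([vq_decode(*vq_quantize(g), d) for _ in range(200000)], axis=0)
print(np.linalg.norm(est - g))       # shrinks as the sample count grows
```

Averaging the decoded points over many workers (or rounds) concentrates around $\boldsymbol{g}$, which is what permits convergence guarantees despite the per-message cost being logarithmic rather than linear in $d$.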
Year
2022
DOI
10.1109/TIT.2022.3161620
Venue
IEEE Transactions on Information Theory
Keywords
Vector quantization, communication efficiency, mean estimation, stochastic gradient descent (SGD)
DocType
Journal
Volume
68
Issue
7
ISSN
0018-9448
Citations
0
PageRank
0.34
References
11
Authors
4
Name                Order   Citations   PageRank
Venkata Gandikota   1       0           0.34
Daniel M. Kane      2       7436        1.43
Raj Kumar Maity     3       0           1.69
Arya Mazumdar       4       3074        1.81