| Title |
|---|
| Distributed optimization for degenerate loss functions arising from over-parameterization |
| Abstract |
|---|
| We consider distributed optimization with degenerate loss functions, where the optimal sets of the local loss functions have a non-empty intersection. This regime often arises in optimizing large-scale multi-agent AI systems (e.g., deep learning systems), where the number of trainable weights far exceeds the number of training samples, leading to highly degenerate loss surfaces. Under appropriate conditions, we prove that distributed gradient descent in this case converges even when communication is arbitrarily infrequent, which is not the case for non-degenerate loss functions. Moreover, we quantitatively analyze the convergence rate, as well as the communication-computation trade-off, providing insights into designing efficient distributed optimization algorithms. Our theoretical findings are confirmed by both distributed convex optimization and deep learning experiments. |
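The setting described in the abstract can be illustrated with a minimal sketch, not taken from the paper itself: two workers run local gradient descent on degenerate least-squares losses f_i(x) = ½‖A_i x − b_i‖² whose solution sets intersect (both are satisfied by a shared x*, since each A_i is underdetermined), and they communicate only every `tau` steps by averaging their iterates. All names, sizes, and step counts here are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: distributed GD with infrequent averaging on
# degenerate local losses f_i(x) = 0.5 * ||A_i x - b_i||^2, where each
# A_i has far fewer rows than columns (over-parameterized), so the
# local optimal sets are affine subspaces with a non-empty intersection.
rng = np.random.default_rng(0)
d, m = 10, 3                          # d parameters, only m equations per worker
x_star = rng.normal(size=d)           # a shared minimizer in the intersection
workers = []
for _ in range(2):
    A = rng.normal(size=(m, d))
    workers.append((A, A @ x_star))   # b_i = A_i x_star, so f_i(x_star) = 0

def local_gd(x, A, b, steps, lr):
    """Run plain gradient descent on 0.5 * ||A x - b||^2."""
    for _ in range(steps):
        x = x - lr * (A.T @ (A @ x - b))
    return x

x = np.zeros(d)
tau, lr = 25, 0.02                    # communicate only once per tau local steps
for _ in range(200):
    # Each worker descends its own loss; the server averages the results.
    x = np.mean([local_gd(x, A, b, tau, lr) for A, b in workers], axis=0)

total_loss = sum(0.5 * np.linalg.norm(A @ x - b) ** 2 for A, b in workers)
```

Because the local solution sets intersect, the periodic averaging does not pull the iterate away from a common minimizer, and `total_loss` is driven to (numerically) zero despite the sparse communication; with disjoint local optima the same scheme would stall at a non-zero consensus error.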
| Year | DOI | Venue |
|---|---|---|
| 2021 | 10.1016/j.artint.2021.103575 | Artificial Intelligence |

| Keywords | DocType | Volume |
|---|---|---|
| Distributed optimization, Over-parameterization, Deep learning | Journal | 301 |

| Issue | ISSN | Citations |
|---|---|---|
| 1 | 0004-3702 | 0 |

| PageRank | References | Authors |
|---|---|---|
| 0.34 | 0 | 2 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Chi Zhang | 1 | 6 | 1.78 |
| Qianxiao Li | 2 | 0 | 1.01 |