Sharp Analysis for Nonconvex SGD Escaping from Saddle Points. - Citegraph

Paper Info

Title
Sharp Analysis for Nonconvex SGD Escaping from Saddle Points.

Abstract
In this paper, we prove that the simplest Stochastic Gradient Descent (SGD) algorithm is able to efficiently escape from saddle points and find an $(\epsilon, O(\epsilon^{0.5}))$-approximate second-order stationary point in $\tilde{O}(\epsilon^{-3.5})$ stochastic gradient computations for generic nonconvex optimization problems, under both gradient-Lipschitz and Hessian-Lipschitz assumptions. This unexpected result subverts the classical belief that SGD requires at least $O(\epsilon^{-4})$ stochastic gradient computations for obtaining an $(\epsilon, O(\epsilon ^{0.5}))$-approximate second-order stationary point. Such SGD rate matches, up to a polylogarithmic factor of problem-dependent parameters, the rate of most accelerated nonconvex stochastic optimization algorithms that adopt additional techniques, such as Nesterov's momentum acceleration, negative curvature search, as well as quadratic and cubic regularization tricks. Our novel analysis gives new insights into nonconvex SGD and can be potentially generalized to a broad class of stochastic optimization algorithms.
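To make the abstract's terminology concrete, here is a minimal sketch of plain SGD leaving a saddle point, followed by a check of the $(\epsilon, \sqrt{\epsilon})$-approximate second-order stationarity condition: gradient norm at most $\epsilon$ and smallest Hessian eigenvalue at least $-\sqrt{\epsilon}$. The toy objective, step size, noise level, and all function names are assumptions chosen for illustration. They are not taken from the paper, and the sketch does not reproduce its analysis or its rate.

```python
import math
import random

# Hypothetical toy objective (not from the paper):
#   f(x, y) = 0.5*x**2 + 0.25*y**4 - 0.5*y**2
# It has a strict saddle at (0, 0) and local minima at (0, 1) and (0, -1).

def grad(x, y):
    # Exact gradient of the toy objective.
    return x, y**3 - y

def min_hessian_eig(x, y):
    # The Hessian is diag(1, 3*y**2 - 1), so its eigenvalues are the diagonal entries.
    return min(1.0, 3.0 * y**2 - 1.0)

def is_approx_sosp(x, y, eps):
    # (eps, sqrt(eps))-approximate second-order stationarity:
    # small gradient and no direction of strongly negative curvature.
    gx, gy = grad(x, y)
    return math.hypot(gx, gy) <= eps and min_hessian_eig(x, y) >= -math.sqrt(eps)

def sgd(x, y, step=0.05, noise=0.05, iters=2000, seed=0):
    # Plain SGD: the stochastic gradient is simulated as the exact gradient
    # plus Gaussian noise (an assumption made for this illustration).
    rng = random.Random(seed)
    for _ in range(iters):
        gx, gy = grad(x, y)
        x -= step * (gx + noise * rng.gauss(0.0, 1.0))
        y -= step * (gy + noise * rng.gauss(0.0, 1.0))
    return x, y

if __name__ == "__main__":
    eps = 0.1
    print("saddle is approx. SOSP:", is_approx_sosp(0.0, 0.0, eps))
    x, y = sgd(0.0, 0.0)
    print("final iterate:", (x, y))
    print("final iterate is approx. SOSP:", is_approx_sosp(x, y, eps))
```

Started exactly at the saddle, the exact gradient is zero, so only the gradient noise moves the iterate. The negative curvature along `y` then amplifies that perturbation until the iterate settles near one of the two minima.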

Year	Venue	DocType	Volume	Citations	PageRank	References	Authors
2019	Conference on Learning Theory	Conference	abs/1902.00247	1	0.35	0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Cong Fang	1	17	7.14
Zhouchen Lin	2	4805	203.69
Tong Zhang	3	7126	611.43
