Abstract |
---|
We propose new continuous-time formulations for first-order stochastic optimization algorithms such as mini-batch gradient descent and variance-reduced methods. We exploit these continuous-time models, together with simple Lyapunov analysis and tools from stochastic calculus, to derive convergence bounds for various classes of non-convex functions. Guided by this analysis, we show that the same Lyapunov arguments hold in discrete time, leading to matching rates. In addition, we use these models and Itō calculus to derive novel insights into the dynamics of SGD, proving that a decreasing learning rate acts as time warping or, equivalently, as landscape stretching. |
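The abstract's time-warping claim admits a short worked illustration. The sketch below covers the deterministic drift term only, assuming the learning-rate schedule enters the continuous-time model as a smooth positive multiplier ψ(t) on the gradient; both ψ and this simplified model are illustrative assumptions, and the paper's full argument treats the stochastic model via Itō calculus, which this sketch omits.

```latex
% Scheduled gradient flow, with \psi(t) > 0 a decreasing learning-rate
% schedule, and the warped clock \tau(t) defined as accumulated step size:
\[
  \dot X(t) = -\,\psi(t)\,\nabla f\bigl(X(t)\bigr),
  \qquad
  \tau(t) = \int_0^t \psi(s)\,\mathrm{d}s .
\]
% Substitute Y(\tau(t)) = X(t); the chain rule and \dot\tau(t) = \psi(t) give
\[
  \frac{\mathrm{d}Y}{\mathrm{d}\tau}
    = \frac{\dot X(t)}{\dot\tau(t)}
    = -\,\nabla f\bigl(Y(\tau)\bigr),
\]
% so in the \tau-clock the schedule disappears: the scheduled flow is plain
% gradient flow run on warped time. Freezing \psi \equiv c > 0 instead yields
% \dot X = -\nabla(c f)(X), i.e. gradient flow on the stretched landscape c f.
```

For the stochastic model the same substitution does not make the noise vanish but rescales the Brownian term (heuristically, ψ dB_t ≈ √ψ dB_τ), which is where the Itō-calculus tools the abstract refers to become necessary.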
Year | Venue | Keywords
---|---|---
2019 | Advances in Neural Information Processing Systems 32 (NeurIPS 2019) | Itō calculus, stochastic calculus
Field | DocType | Volume
---|---|---
Convergence (mathematics), Lyapunov function, Itō calculus, Gradient descent, Mathematical optimization, Stochastic optimization, Time warping, Stochastic calculus, Algorithm, Mathematics | Journal | 32
ISSN | Citations | PageRank
---|---|---
1049-5258 | 0 | 0.34
References | Authors
---|---
13 | 2
Name | Order | Citations | PageRank |
---|---|---|---
Orvieto, Antonio | 1 | 0 | 3.04 |
Aurelien Lucchi | 2 | 2419 | 89.45 |