Title
Understanding the Acceleration Phenomenon via High-Resolution Differential Equations.
Abstract
Gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distinguish between two fundamentally different algorithms---Nesterovu0027s accelerated method for strongly convex functions (NAG-SC) and Polyaku0027s heavy-ball method---we study an alternative limiting process that yields high-resolution ODEs. We show that these ODEs permit a general Lyapunov function framework for the analysis of convergence in both continuous and discrete time. We also show that these ODEs are more accurate surrogates for the underlying algorithms; in particular, they not only distinguish between NAG-SC and Polyaku0027s heavy-ball method, but they allow the identification of a term that we refer to as gradient correction that is present in NAG-SC but not in the heavy-ball method and is responsible for the qualitative difference in convergence of the two methods. We also use the high-resolution ODE framework to study Nesterovu0027s accelerated method for (non-strongly) convex functions, uncovering a hitherto unknown result---that NAG-C minimizes the squared norm at an inverse cubic rate. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that are shown to maintain the accelerated convergence rates of NAG-C for smooth convex functions.
Year
Venue
Field
2018
arXiv: Optimization and Control
Convergence (routing),Gradient method,Lyapunov function,Differential equation,Applied mathematics,Mathematical optimization,Ordinary differential equation,Convex function,Discrete time and continuous time,Ode,Mathematics
DocType
Volume
Citations 
Journal
abs/1810.08907
1
PageRank 
References 
Authors
0.35
17
4
Name
Order
Citations
PageRank
Bin Shi121.37
Simon Du221029.79
Michael I. Jordan3312203640.80
Weijie Su48611.84