Title
Nonlinear Acceleration of Deep Neural Networks
Abstract
Regularized nonlinear acceleration (RNA) is a generic extrapolation scheme for optimization methods, with marginal computational overhead. It aims to improve convergence using only the iterates of simple iterative algorithms. However, so far its application to optimization has been theoretically limited to gradient descent and other single-step algorithms. Here, we adapt RNA to a much broader setting that includes stochastic gradient descent with momentum and Nesterov's fast gradient method. We use it to train deep neural networks and empirically observe that extrapolated networks are more accurate, especially in the early iterations. A straightforward application of our algorithm when training ResNet-152 on ImageNet produces a top-1 test error of 20.88%, improving on the reference classification pipeline by 0.8%. Furthermore, the code runs offline in this case, so it never negatively affects performance.
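To make the extrapolation step concrete, below is a minimal sketch of a standard RNA step in Python, not the authors' exact implementation: given stored iterates (e.g., flattened network weights saved once per epoch), it finds coefficients that sum to one and minimize the norm of the combined residuals, with Tikhonov regularization for stability. The function name and the regularization constant are illustrative assumptions.

```python
import numpy as np

def rna_extrapolate(iterates, lam=1e-10):
    """One regularized nonlinear acceleration (RNA) step (sketch).

    iterates: list of k+1 parameter vectors x_0, ..., x_k (1-D arrays),
              e.g. network weights flattened once per checkpoint.
    lam:      Tikhonov regularization weight (illustrative value).

    Returns a linear combination of the iterates whose coefficients
    sum to one and minimize the norm of the combined residuals
    r_i = x_{i+1} - x_i.
    """
    X = np.stack(iterates, axis=1)        # shape (d, k+1)
    R = X[:, 1:] - X[:, :-1]              # residual matrix, shape (d, k)
    RR = R.T @ R
    RR = RR / np.linalg.norm(RR, 2)       # rescale so lam is scale-free
    k = RR.shape[0]
    z = np.linalg.solve(RR + lam * np.eye(k), np.ones(k))
    c = z / z.sum()                       # coefficients sum to 1
    return X[:, 1:] @ c                   # extrapolated parameters
```

In the offline use mentioned in the abstract, one would run such a step on saved checkpoints, load the extrapolated vector back into the model, and keep it only if validation accuracy improves; since the underlying training run is untouched, the procedure cannot hurt performance.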
Year
2018
Venue
arXiv: Optimization and Control
Field
Convergence (routing), Overhead (computing), Mathematical optimization, Gradient descent, Nonlinear system, Algorithm, Extrapolation, Momentum, Acceleration, Iterated function, Mathematics
DocType
Journal
Volume
abs/1805.09639
Citations
1
PageRank
0.41
References
5
Authors
4
Name                   Order  Citations  PageRank
Damien Scieur          1      1          1.09
Edouard Oyallon        2      17         3.64
Alexandre d'Aspremont  3      991        101.61
Francis Bach           4      11490      622.29