Title
Sub-Sampled Newton Methods I: Globally Convergent Algorithms.
Abstract
Large-scale optimization problems are ubiquitous in machine learning and data analysis, and there is a plethora of algorithms for solving such problems. Many of these algorithms employ sub-sampling, either to speed up computations or to implicitly implement a form of statistical regularization. In this paper, we consider second-order iterative optimization algorithms and provide bounds on the convergence of variants of Newton's method that incorporate uniform sub-sampling as a means to estimate the gradient and/or Hessian. Our bounds are non-asymptotic and quantitative. Our algorithms are global and are guaranteed to converge from any initial iterate. Using random matrix concentration inequalities, one can sub-sample the Hessian in a way that preserves curvature information. Our first algorithm incorporates Hessian sub-sampling while using the full gradient. We also give additional convergence results for the case where the sub-sampled Hessian is regularized, either by modifying its spectrum or through ridge-type regularization. Next, in addition to Hessian sub-sampling, we consider sub-sampling the gradient as a way to further reduce the computational complexity per iteration. We use approximate matrix multiplication results from randomized numerical linear algebra to obtain the proper sampling strategy. In all of these algorithms, computing the update boils down to solving a large-scale linear system, which can be computationally expensive. As a remedy, for all of our algorithms we also give global convergence results for the case of inexact updates, where such linear systems are solved only approximately. This paper has a more advanced companion paper, [42], in which we demonstrate that, by performing a finer-grained analysis, we can obtain problem-independent bounds for the local convergence of these algorithms and explore trade-offs that improve upon the basic results of the present paper.
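For readers who want a concrete picture of the structure the abstract describes, the following is a minimal sketch of a Hessian sub-sampled Newton iteration with full gradients and Armijo backtracking, written as a hypothetical Python implementation. The names subsampled_newton, loss, grad_full, and hess_sample are illustrative stand-ins (they are not the paper's notation), and the exact linear solve marks where the paper's inexact-update variants would substitute an approximate solver.

```python
import numpy as np

def subsampled_newton(loss, grad_full, hess_sample, w0, n, sample_size,
                      c=1e-4, beta=0.5, tol=1e-8, max_iters=100, seed=None):
    """Hessian sub-sampled Newton with full gradients (illustrative sketch).

    loss(w)            -> objective value F(w)
    grad_full(w)       -> full gradient of F at w
    hess_sample(w, S)  -> average Hessian of the sampled components in S
    """
    rng = np.random.default_rng(seed)
    w = np.asarray(w0, dtype=float)
    for _ in range(max_iters):
        g = grad_full(w)
        if np.linalg.norm(g) <= tol:
            break
        # Uniform sub-sampling of component indices for the Hessian estimate.
        S = rng.choice(n, size=sample_size, replace=False)
        H = hess_sample(w, S)
        # Newton direction from the sub-sampled Hessian; the inexact variants
        # would solve this system only approximately (e.g., with a few
        # conjugate-gradient iterations).
        p = np.linalg.solve(H, -g)
        # Armijo backtracking line search, the kind of globalization that
        # yields convergence from any initial iterate.
        t, f0, slope = 1.0, loss(w), g @ p
        while loss(w + t * p) > f0 + c * t * slope:
            t *= beta
        w = w + t * p
    return w
```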
Year
2016
Venue
arXiv: Optimization and Control
Field
Discrete mathematics, Mathematical optimization, Quasi-Newton method, Linear system, Algorithm, Hessian matrix, Local convergence, Matrix multiplication, Optimization problem, Numerical linear algebra, Mathematics, Computational complexity theory
Volume
abs/1601.04737
Citations
17
PageRank
0.62
References
22
Authors
2
Name | Order | Citations | PageRank
Farbod Roosta-Khorasani | 1 | 102 | 9.25
Michael W. Mahoney | 2 | 3297 | 218.10