Title
Distributed Second-order Convex Optimization.
Abstract
Convex optimization problems arise frequently in diverse machine learning (ML) applications. First-order methods, i.e., those that rely solely on gradient information, are most commonly used to solve these problems, a choice motivated by their simplicity and low per-iteration cost. Second-order methods, which rely on curvature information through the dense Hessian matrix, have thus far proven prohibitively expensive at scale, in terms of both computational and memory requirements. We present a novel multi-GPU distributed formulation of a second-order (Newton-type) solver for convex finite-sum minimization problems arising in multi-class classification. Our distributed formulation relies on the Alternating Direction Method of Multipliers (ADMM), which requires only one round of communication per iteration, significantly reducing communication overheads while incurring minimal convergence overhead. By leveraging the computational capabilities of GPUs, we demonstrate that the per-iteration cost of Newton-type methods can be reduced to be on par with, if not better than, that of state-of-the-art first-order alternatives. Given their significantly faster convergence rates, we demonstrate that our methods can process large datasets in much shorter time (orders of magnitude in many cases) than existing first- and second-order methods, while yielding similar test accuracy.
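To make the approach described in the abstract concrete, the following is a minimal, single-process sketch of consensus ADMM with per-worker Newton solves in NumPy. It illustrates the general technique only, not the authors' multi-GPU implementation: the use of binary logistic regression (rather than multi-class classification), the penalty parameter rho, the regularization strength, and all function names are assumptions made for this example. Each simulated worker solves its regularized local subproblem with a few Newton steps, and a single averaging step per iteration stands in for the one communication round mentioned above.

```python
# Illustrative sketch only: consensus ADMM with local Newton-type subproblem solves.
# Assumptions: binary logistic regression, NumPy on a single process, rho/lam chosen arbitrarily.
import numpy as np

def local_newton(X, y, z, u, rho, lam, w, steps=5):
    """Approximately solve the per-worker ADMM subproblem
    min_w  logistic_loss(X, y; w) + (lam/2)||w||^2 + (rho/2)||w - z + u||^2
    using a few Newton iterations."""
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))               # predicted probabilities
        grad = X.T @ (p - y) + lam * w + rho * (w - z + u)
        D = p * (1.0 - p)                               # logistic curvature weights
        H = (X.T * D) @ X + (lam + rho) * np.eye(w.size)  # regularized Hessian
        w = w - np.linalg.solve(H, grad)                # Newton step
    return w

def consensus_admm(parts, dim, rho=1.0, lam=1e-2, iters=50):
    """parts: list of (X_k, y_k) local shards; returns the consensus iterate z."""
    K = len(parts)
    W = [np.zeros(dim) for _ in range(K)]
    U = [np.zeros(dim) for _ in range(K)]
    z = np.zeros(dim)
    for _ in range(iters):
        # local (would-be parallel) Newton-type subproblem solves
        W = [local_newton(X, y, z, U[k], rho, lam / K, W[k])
             for k, (X, y) in enumerate(parts)]
        # single "communication" round: average w_k + u_k to form the consensus z
        z = np.mean([W[k] + U[k] for k in range(K)], axis=0)
        # dual updates stay local
        U = [U[k] + W[k] - z for k in range(K)]
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_true = rng.normal(size=10)
    X = rng.normal(size=(400, 10))
    y = (X @ w_true + 0.1 * rng.normal(size=400) > 0).astype(float)
    shards = [(X[i::4], y[i::4]) for i in range(4)]    # 4 simulated workers
    z = consensus_admm(shards, dim=10)
    acc = np.mean(((X @ z) > 0) == y)
    print(f"training accuracy of consensus model: {acc:.3f}")
```

In a genuine multi-GPU setting, the loop over shards would become per-device solves and the averaging step a single all-reduce, which is where the one-communication-round-per-iteration property comes from.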
Year
2018
Venue
arXiv: Learning
Field
Convergence, Orders of magnitude, Mathematical optimization, Curvature, Hessian matrix, Convexity, Minimization, Solver, Convex optimization, Mathematics
DocType
Journal
Volume
abs/1807.07132
Citations
1
PageRank
0.36
References
0
Authors
5
Name, Order, Citations, PageRank
Chih-Hao Fang, 1, 1, 0.36
S. B. Kylasa, 2, 12, 1.94
Farbod Roosta-Khorasani, 3, 102, 9.25
Michael W. Mahoney, 4, 3297, 218.10
Ananth Grama, 5, 1812, 136.25