Title
Fast Approximation of the Gauss-Newton Hessian Matrix for the Multilayer Perceptron
Abstract
We introduce a fast algorithm for entrywise evaluation of the Gauss-Newton Hessian (GNH) matrix for the fully connected feed-forward neural network. The algorithm has a precomputation step and a sampling step. While it generally requires O(Nn) work to compute an entry (and the entire column) in the GNH matrix for a neural network with N parameters and n data points, our fast sampling algorithm reduces the cost to O(n + d/ε^2) work, where d is the output dimension of the network and ε is a prescribed accuracy (independent of N). One application of our algorithm is constructing the hierarchical-matrix (H-matrix) approximation of the GNH matrix for solving linear systems and eigenvalue problems. It generally requires O(N^2) memory and O(N^3) work to store and factorize the GNH matrix, respectively. The H-matrix approximation requires only an O(N r_o) memory footprint and O(N r_o^2) work to be factorized, where r_o << N is the maximum rank of off-diagonal blocks in the GNH matrix. We demonstrate the performance of our fast algorithm and the H-matrix approximation on classification and autoencoder neural networks.
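A note on the O(n + d/ε^2) claim above: each GNH entry is a sum of n per-example terms, so it can be estimated by sampling data points. Below is a minimal Python sketch of that idea for a least-squares loss, where the GNH reduces to G = sum_k J_k^T J_k; it is not the paper's exact algorithm, and the dense per-example Jacobian array, the norm-based sampling distribution, and all function names are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not the paper's exact procedure): Monte Carlo
# estimation of one GNH entry
#     G[i, j] = sum_k J_k[:, i] . J_k[:, j],
# where J_k is the d-by-N Jacobian of the network output at data
# point k. The dense array J below is a hypothetical stand-in for
# whatever the precomputation step actually stores.

rng = np.random.default_rng(0)
n, d, N = 1000, 10, 50               # data points, output dim, parameters
J = rng.standard_normal((n, d, N))   # hypothetical per-example Jacobians

def gnh_entry_exact(J, i, j):
    """Exact entry: O(n d) work."""
    return np.einsum("kd,kd->", J[:, :, i], J[:, :, j])

def gnh_entry_sampled(J, i, j, s, rng):
    """Unbiased importance-sampled estimate: O(n) to form the
    sampling distribution plus O(s d) for the s sampled terms."""
    a, b = J[:, :, i], J[:, :, j]    # n-by-d slices for columns i, j
    w = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    if w.sum() == 0.0:
        return 0.0
    p = w / w.sum()                  # probability of drawing data point k
    idx = rng.choice(n, size=s, p=p)
    terms = (a[idx] * b[idx]).sum(axis=1) / p[idx]
    return terms.mean()              # E[term] = sum_k a_k . b_k

print(gnh_entry_exact(J, 3, 17))
print(gnh_entry_sampled(J, 3, 17, s=500, rng=rng))
```

Sampling proportionally to the product of the two column norms is the standard variance-reducing choice for a sum of inner products; with O(1/ε^2) samples the estimate reaches a prescribed relative accuracy ε independent of N.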
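Similarly, the O(N r_o) storage claim for the H-matrix approximation rests on off-diagonal blocks of the GNH being numerically low-rank, so each block can be stored as two thin factors. A hedged illustration using a plain truncated SVD (the paper instead builds the factors from sampled entries, without forming blocks explicitly):

```python
import numpy as np

def compress_block(block, tol=1e-8):
    """Replace an m-by-n off-diagonal block with rank-r factors
    U (m-by-r) and Vt (r-by-n), so storage drops from m*n to
    r*(m + n). Illustrates the H-matrix storage argument only;
    this is not the paper's construction."""
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    r = int(np.sum(s > tol * s[0]))  # numerical rank at tolerance tol
    return U[:, :r] * s[:r], Vt[:r]

# Example: a block of exact rank 5 compresses to r = 5 factors.
rng = np.random.default_rng(1)
block = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 300))
U, Vt = compress_block(block)
print(U.shape, Vt.shape)                # (200, 5) (5, 300)
print(np.linalg.norm(block - U @ Vt))   # small reconstruction error
```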
Year
2021
DOI
10.1137/19M129961X
Venue
SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS
Keywords
Gauss-Newton Hessian, fast Monte Carlo sampling, hierarchical matrix, second-order optimization, multilayer perceptron
DocType
Journal
Volume
42
Issue
1
ISSN
0895-4798
Citations
1
PageRank
0.35
References
0
Authors
5
Name | Order | Citations | PageRank
Chao Chen | 1 | 2032 | 185.26
Severin Reiz | 2 | 1 | 0.35
Chenhan D. Yu | 3 | 1 | 0.35
Hans-Joachim Bungartz | 4 | 1 | 0.35
George Biros | 5 | 938 | 77.86