Title
Convergence of Deep Neural Networks to a Hierarchical Covariance Matrix Decomposition.
Abstract
We show that in a deep neural network trained with ReLU activations, the low-lying layers should be replaceable with truncated, linearly activated layers. We derive the gradient descent equations for this truncated linear model and demonstrate that, if the distribution of the training data is stationary during training, the optimal choice of weights in these low-lying layers is the set of eigenvectors of the covariance matrix of the data. If the training data is sufficiently random and uniform, these eigenvectors can be found using a small fraction of the training data, thus reducing the computational complexity of training. We show how this can be done recursively to form successive trained layers. At least for the first layer, our tests show that this approach improves image classification while reducing network size.
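As a rough illustration of the idea described in the abstract, the sketch below estimates the data covariance matrix from a small random subset of the training data and uses its leading eigenvectors as fixed weights for a linear first layer. It is a minimal sketch only: the function name covariance_eigen_init, the subset fraction, and the number of units are illustrative assumptions, not values or code from the paper.

```python
import numpy as np

def covariance_eigen_init(X_subset, n_units):
    """Return the top n_units eigenvectors of the empirical covariance
    of X_subset (shape: n_samples x n_features), for use as the weights
    of a truncated linear first layer."""
    Xc = X_subset - X_subset.mean(axis=0)        # center the data
    cov = Xc.T @ Xc / (Xc.shape[0] - 1)          # empirical covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigendecomposition (symmetric matrix)
    order = np.argsort(eigvals)[::-1]            # sort by decreasing eigenvalue
    return eigvecs[:, order[:n_units]]           # (n_features, n_units) weight matrix

# Hypothetical usage: estimate the eigenvectors from ~5% of the data,
# fix the first (linear, un-rectified) layer to this projection,
# then train the remaining layers as usual.
# X = ...                                                  # training set, shape (N, d)
# subset = X[np.random.choice(len(X), len(X) // 20, replace=False)]
# W1 = covariance_eigen_init(subset, n_units=64)
# h1 = X @ W1                                              # output of the linear first layer
```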
Year: 2017
Venue: arXiv: Learning
Field: Convergence (routing), Computer science, Algorithm, Theoretical computer science, Covariance matrix, Deep neural networks
DocType:
Volume: abs/1703.04757
Citations: 0
Journal:
PageRank: 0.34
References: 0
Authors: 3
Name                      Order  Citations  PageRank
Nima Dehmamy              1      4          2.07
Neda Rohani               2      10         5.71
Aggelos K. Katsaggelos    3      3410       340.41