Title
No Fine-Tuning, No Cry: Robust SVD for Compressing Deep Networks
Abstract
A common technique for compressing a neural network is to compute, via SVD, the rank-k ℓ2 approximation A_k of the matrix A ∈ ℝ^(n×d) that corresponds to a fully connected layer (or embedding layer). Here, d is the number of input neurons of the layer, n is the number of neurons in the next layer, and A_k is stored in O((n+d)k) memory instead of O(nd). A fine-tuning step is then used to improve this initial compression. However, end users may not have the computational resources, time, or budget to run this fine-tuning stage. Furthermore, the original training set may not be available. In this paper, we provide an algorithm for compressing neural networks with an initial compression time similar to that of common techniques, but without the fine-tuning step. The main idea is to replace the rank-k ℓ2 approximation with an ℓp approximation, for p ∈ [1, 2], which is known to be less sensitive to outliers but much harder to compute. Our main technical result is a practical and provable approximation algorithm for computing it for any p ≥ 1, based on modern techniques in computational geometry. Extensive experimental results on the GLUE benchmark for compressing the networks BERT, DistilBERT, XLNet, and RoBERTa confirm this theoretical advantage.
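For illustration, the sketch below shows only the standard SVD-based ℓ2 compression that the abstract describes as the common baseline, not the paper's robust ℓp algorithm. A fully connected layer's weight matrix A ∈ ℝ^(n×d) is factored into two rank-k matrices, so the layer is stored in O((n+d)k) memory instead of O(nd). The matrix dimensions and the rank k below are illustrative assumptions.

# Minimal sketch of the l2 (SVD) baseline compression, assuming a single dense layer.
import numpy as np

def svd_compress(A: np.ndarray, k: int):
    """Return factors (U_k, V_k) with A ~= U_k @ V_k, the best rank-k l2 approximation."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k] * s[:k]   # n x k, singular values folded into the left factor
    V_k = Vt[:k, :]          # k x d
    return U_k, V_k

rng = np.random.default_rng(0)
A = rng.standard_normal((768, 3072))   # illustrative layer size, e.g. a transformer feed-forward matrix
U_k, V_k = svd_compress(A, k=64)

# The dense map x -> A @ x is replaced by two thinner layers x -> U_k @ (V_k @ x).
print("original params:", A.size, "compressed params:", U_k.size + V_k.size)
print("relative l2 error:", np.linalg.norm(A - U_k @ V_k) / np.linalg.norm(A))

The paper's contribution is to replace the ℓ2 objective in such a factorization with a provable ℓp approximation (p ∈ [1, 2]), which is more robust to outliers and, per the abstract, removes the need for fine-tuning.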
Year
2021
DOI
10.3390/s21165599
Venue
SENSORS
Keywords
matrix factorization, neural networks compression, robust low rank approximation, Löwner ellipsoid
DocType
Journal
Volume
21
Issue
16
ISSN
1424-8220
Citations
0
PageRank
0.34
References
0
Authors
4
Name            Order   Citations   PageRank
Tukan Murad     1       0           2.03
Alaa Maalouf    2       0           0.34
Matan Weksler   3       0           0.34
Dan Feldman     4       94          5.06