Title
AdaSTE: An Adaptive Straight-Through Estimator to Train Binary Neural Networks
Abstract
We propose a new algorithm for training deep neural networks (DNNs) with binary weights. In particular, we first cast the problem of training binary neural networks (BiNNs) as a bilevel optimization instance and subsequently construct flexible relaxations of this bilevel program. The resulting training method shares its algorithmic simplicity with several existing approaches to train BiNNs, in particular with the straight-through gradient estimator successfully employed in BinaryConnect and subsequent methods. In fact, our proposed method can be interpreted as an adaptive variant of the original straight-through estimator that conditionally (but not always) acts like a linear mapping in the backward pass of error propagation. Experimental results demonstrate that our new algorithm offers favorable performance compared to existing approaches. This work was partially supported by the Wallenberg AI, Autonomous Systems and Software Program (WASP) funded by the Knut and Alice Wallenberg Foundation.
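For orientation, the plain (non-adaptive) straight-through estimator that the abstract refers to can be sketched as follows. This is a minimal illustration and not the AdaSTE algorithm itself: the single linear unit, the squared-error loss, the clipping of latent weights to [-1, 1], and the names `binarize` and `ste_step` are all assumptions made for this example.

```python
def binarize(w):
    """Forward pass: map latent real-valued weights to {-1, +1} via sign."""
    return [1.0 if wi >= 0.0 else -1.0 for wi in w]


def ste_step(w, x, target, lr=0.1):
    """One SGD step on 0.5 * (y - target)^2 for a single linear unit.

    The forward pass uses the binarized weights. The backward pass applies
    the plain straight-through estimator: the gradient w.r.t. the binary
    weights is copied unchanged to the latent weights, i.e. sign() is
    treated as the identity (a linear map) during backpropagation.
    """
    b = binarize(w)
    y = sum(bi * xi for bi, xi in zip(b, x))
    err = y - target
    grad_b = [err * xi for xi in x]  # exact gradient w.r.t. binary weights
    grad_w = grad_b                  # straight-through: pass it on as-is
    # Assumed convention for this sketch: keep latent weights in [-1, 1].
    new_w = [max(-1.0, min(1.0, wi - lr * gi)) for wi, gi in zip(w, grad_w)]
    return new_w, 0.5 * err * err


# Toy usage: three inputs, target output -1.
w = [0.2, -0.1, 0.05]
x = [1.0, 1.0, 1.0]
losses = []
for _ in range(3):
    w, loss = ste_step(w, x, target=-1.0)
    losses.append(loss)
```

In this sketch the backward pass always behaves like a linear mapping. According to the abstract, the proposed adaptive variant does so only conditionally; the condition itself is not given in this record, so it is not reproduced here.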
Year
2022
DOI
10.1109/CVPR52688.2022.00055
Venue
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Keywords
Optimization methods, Machine learning
DocType
Conference
ISSN
1063-6919
ISBN
978-1-6654-6947-0
Citations
0
PageRank
0.34
References
8
Authors
4
Name                  Order   Citations   PageRank
Huu Le                1       0           0.34
Rasmus Kjær Høier     2       0           0.34
Che-Tsung Lin         3       0           0.34
Christopher Zach      4       1457        84.01