Title: HitNet: Hybrid Ternary Recurrent Neural Network
Abstract: Quantization is a promising technique to reduce the model size, memory footprint, and computational cost of neural networks for deployment on embedded devices with limited resources. Although quantization has achieved impressive success in convolutional neural networks (CNNs), it still suffers from large accuracy degradation on recurrent neural networks (RNNs), especially in extremely low-bit cases. In this paper, we first investigate the accuracy degradation of RNNs under different quantization schemes and visualize the distribution of tensor values in the full-precision models. Our observation reveals that, because weights and activations follow different distributions, different quantization methods should be used for each part. Accordingly, we propose HitNet, a hybrid ternary RNN, which bridges the accuracy gap between the full-precision model and the quantized model with ternary weights and activations. In HitNet, we develop a hybrid quantization method to quantize weights and activations. Moreover, we introduce a sloping factor into the activation functions to address the error-sensitivity problem, further closing this accuracy gap. We test our method on typical RNN models, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Overall, HitNet can quantize RNN models into ternary values of {-1, 0, 1} and significantly outperform the state-of-the-art methods for extremely quantized RNNs. Specifically, we improve the perplexity per word (PPW) of a ternary LSTM on the Penn Tree Bank (PTB) corpus from 126 to 110.3 and of a ternary GRU from 142 to 113.5.
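The abstract names two mechanisms: quantizing tensors to the ternary set {-1, 0, 1} and adding a sloping factor to the activation functions. The paper's exact quantizers are not reproduced here, so the snippet below is only a minimal sketch under assumed conventions: ternarize() uses the common mean-magnitude threshold heuristic (in the spirit of ternary weight networks, not necessarily HitNet's rule), and sloped_sigmoid() shows how a slope parameter k reshapes a gate activation; the function names and the 0.7 threshold scale are hypothetical.

    import numpy as np

    def ternarize(w, delta_scale=0.7):
        # Map each value to -1, 0, or +1 using a magnitude threshold.
        # delta_scale = 0.7 is a common heuristic, not HitNet's setting.
        delta = delta_scale * np.mean(np.abs(w))
        t = np.zeros_like(w)
        t[w > delta] = 1.0
        t[w < -delta] = -1.0
        return t

    def sloped_sigmoid(x, k=0.5):
        # Sigmoid with a sloping factor k: smaller k gives a flatter curve,
        # one way to make quantized activations less sensitive to error.
        return 1.0 / (1.0 + np.exp(-k * x))

    # Example: ternarize a small random weight block.
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(4, 4))
    print(ternarize(W))                                # entries in {-1, 0, 1}
    print(sloped_sigmoid(np.array([-2.0, 0.0, 2.0])))  # flattened sigmoid values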
Year: 2018
Venue: Advances in Neural Information Processing Systems 31 (NIPS 2018)
Keywords: convolutional neural networks, long short-term memory, recurrent neural networks, limited resources, memory footprint, recurrent neural network, boltzmann machine
Field: Perplexity, Boltzmann machine, Convolutional neural network, Computer science, Recurrent neural network, Algorithm, Quantization (physics), Artificial intelligence, Quantization (signal processing), Memory footprint, Machine learning, Computation
DocType: Conference
Volume: 31
ISSN: 1049-5258
Citations: 3
PageRank: 0.38
References: 0
Authors: 6
Name            Order  Citations  PageRank
Peiqi Wang      1      11         2.52
Xinfeng Xie     2      52         6.39
Lei Deng        3      177        30.01
Guoqi Li        4      387        46.18
Dongsheng Wang  5      373        64.93
Yuan Xie        6      6430       407.00