Title: HitNet: Hybrid Ternary Recurrent Neural Network
Abstract: Quantization is a promising technique to reduce the model size, memory footprint, and computational cost of neural networks for deployment on embedded devices with limited resources. Although quantization has achieved impressive success in convolutional neural networks (CNNs), it still suffers from large accuracy degradation on recurrent neural networks (RNNs), especially in extremely low-bit cases. In this paper, we first investigate the accuracy degradation of RNNs under different quantization schemes and visualize the distribution of tensor values in the full-precision models. Our observation reveals that, because weights and activations follow different distributions, different quantization methods should be used for each part. Accordingly, we propose HitNet, a hybrid ternary RNN, which bridges the accuracy gap between the full-precision model and the quantized model with ternary weights and activations. In HitNet, we develop a hybrid quantization method to quantize weights and activations. Moreover, we introduce a sloping factor into the activation functions to address the error-sensitivity problem, further closing this accuracy gap. We test our method on typical RNN models, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Overall, HitNet can quantize RNN models into ternary values of {-1, 0, 1} and significantly outperform the state-of-the-art methods for extremely quantized RNNs. Specifically, we improve the perplexity per word (PPW) of a ternary LSTM on the Penn Tree Bank (PTB) corpus from 126 to 110.3 and of a ternary GRU from 142 to 113.5.
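The abstract names two mechanisms: quantizing tensors to the ternary set {-1, 0, 1} and adding a sloping factor to the activation functions. The paper's exact quantizers are not reproduced here, so the snippet below is only a minimal sketch under assumed conventions: ternarize() uses the common mean-magnitude threshold heuristic (in the spirit of ternary weight networks, not necessarily HitNet's rule), and sloped_sigmoid() shows how a slope parameter k reshapes a gate activation; the function names and the 0.7 threshold scale are hypothetical.

    import numpy as np

    def ternarize(w, delta_scale=0.7):
        # Map each value to -1, 0, or +1 using a magnitude threshold.
        # delta_scale = 0.7 is a common heuristic, not HitNet's setting.
        delta = delta_scale * np.mean(np.abs(w))
        t = np.zeros_like(w)
        t[w > delta] = 1.0
        t[w < -delta] = -1.0
        return t

    def sloped_sigmoid(x, k=0.5):
        # Sigmoid with a sloping factor k: smaller k gives a flatter curve,
        # one way to make quantized activations less sensitive to error.
        return 1.0 / (1.0 + np.exp(-k * x))

    # Example: ternarize a small random weight block.
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(4, 4))
    print(ternarize(W))                                # entries in {-1, 0, 1}
    print(sloped_sigmoid(np.array([-2.0, 0.0, 2.0])))  # flattened sigmoid values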
Year: 2018
Venue: Advances in Neural Information Processing Systems 31 (NIPS 2018)
Keywords: convolutional neural networks, long short-term memory, recurrent neural networks, limited resources, memory footprint, recurrent neural network, boltzmann machine
Field: Perplexity, Boltzmann machine, Convolutional neural network, Computer science, Recurrent neural network, Algorithm, Quantization (physics), Artificial intelligence, Quantization (signal processing), Memory footprint, Machine learning, Computation
DocType: Conference
Volume: 31
ISSN: 1049-5258
Citations: 3
PageRank: 0.38
References: 0
Authors: 6
Name            Order  Citations  PageRank
Peiqi Wang      1      11         2.52
Xinfeng Xie     2      52         6.39
Lei Deng        3      177        30.01
Guoqi Li        4      387        46.18
Dongsheng Wang  5      373        64.93
Yuan Xie        6      6430       407.00