Title
Towards Compact and Fast Neural Machine Translation Using a Combined Method.
Abstract
Neural Machine Translation (NMT) places a heavy burden on computation and memory, making it challenging to deploy NMT models on devices with limited computation and memory budgets. This paper presents a four-stage pipeline to compress the model and speed up decoding for NMT. Our method first introduces a compact architecture based on a convolutional encoder and weight-shared embeddings. Then weight pruning is applied to obtain a sparse model. Next, we propose a fast sequence interpolation approach that enables greedy decoding to achieve performance on par with beam search, so the time-consuming beam search can be replaced by simple greedy decoding. Finally, vocabulary selection is used to reduce the computation of the softmax layer. Our final model achieves a 10× speedup, a 17× reduction in parameters, a 35 MB storage size, and performance comparable to the baseline model.
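To illustrate the weight-pruning stage mentioned in the abstract, the sketch below shows simple magnitude-based pruning in NumPy: entries whose absolute value falls below a percentile threshold are zeroed, yielding a sparse weight matrix. The percentile threshold, the layer-by-layer application, and the function name are illustrative assumptions, not the exact procedure of the paper.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries so that roughly
    `sparsity` (e.g. 0.8 for 80%) of the matrix becomes zero."""
    threshold = np.percentile(np.abs(weights), sparsity * 100)
    mask = np.abs(weights) >= threshold
    return weights * mask

# Example: prune a hypothetical 512x256 weight matrix to 80% sparsity.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 256))
W_sparse = magnitude_prune(W, sparsity=0.8)
print(f"nonzero fraction: {np.count_nonzero(W_sparse) / W_sparse.size:.2f}")
```

In practice, pruned models are typically fine-tuned afterwards to recover accuracy, and the zeroed weights are stored in a sparse format to realize the storage savings.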
Year
2017
DOI
10.18653/v1/d17-1154
Venue
EMNLP
Field
Softmax function, Computer science, Interpolation, Beam search, Algorithm, Artificial intelligence, Encoder, Decoding methods, Artificial neural network, Machine learning, Speedup, Computation
DocType
Conference
Volume
D17-1
Citations
3
PageRank
0.38
References
14
Authors
5
Name | Order | Citations | PageRank
Xiaowei Zhang | 1 | 12 | 4.06
Wei Chen | 2 | 9 | 2.86
Feng Wang | 3 | 27 | 3.51
Shuang Xu | 4 | 4 | 2.76
Bo Xu | 5 | 241 | 36.59