Title
Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation
Abstract
Neural sequence-to-sequence models, particularly the Transformer, are the state of the art in machine translation. Yet these neural networks are very sensitive to architecture and hyperparameter settings. Optimizing these settings by grid or random search is computationally expensive because it requires many training runs. In this paper, we incorporate architecture search into a single training run through auto-sizing, which uses regularization to delete neurons in a network over the course of training. On very low-resource language pairs, we show that auto-sizing can improve BLEU scores by up to 3.9 points while removing one-third of the parameters from the model.
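The abstract's core idea, deleting neurons during training with a sparsity-inducing regularizer, can be illustrated with a short sketch. The code below is a minimal illustration, not the authors' implementation: it assumes an L2,1 (group lasso) penalty on the rows of a single weight matrix, applied as a proximal step after each gradient update so that whole rows (neurons) are driven exactly to zero and can be deleted; all names, sizes, and hyperparameters are hypothetical.

```python
# Minimal sketch of auto-sizing with a row-wise L2,1 (group) regularizer and a
# proximal update. Hypothetical toy example, not the paper's implementation.
import numpy as np

def prox_l21(W, strength, lr):
    """Proximal step for the L2,1 penalty: shrink each row's norm by lr*strength,
    setting the row exactly to zero once its norm falls below that threshold."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lr * strength / np.maximum(norms, 1e-12))
    return W * scale

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))            # 8 "neurons" (rows), 4 inputs: toy sizes
target = np.zeros_like(W)
target[:3] = rng.normal(size=(3, 4))   # pretend only 3 neurons are useful

lr, strength = 0.1, 0.5
for step in range(200):
    grad = W - target                  # stand-in gradient of a toy task loss
    W = W - lr * grad                  # ordinary SGD step on the task loss
    W = prox_l21(W, strength, lr)      # regularizer step that zeroes whole rows

alive = np.linalg.norm(W, axis=1) > 0
print(f"{alive.sum()} of {len(alive)} neurons survive; zero rows can be deleted")
```

In the paper, the same principle is applied to weight matrices inside the Transformer during a single training run, with the regularization strength controlling how many neurons are removed.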
Year
2019
DOI
10.18653/v1/D19-5625
Venue
NGT@EMNLP-IJCNLP
Field
Automotive engineering, Computer science, Machine translation, Transformer, Sizing
DocType
Conference
Volume
D19-56
Citations
0
PageRank
0.34
References
0
Authors
5
Name                Order  Citations  PageRank
Kenton Murray       1      0          0.34
Jeffery Kinnison    2      0          0.68
Toan Nguyen         3      55         15.70
Walter J. Scheirer  4      773        52.81
David Chiang        5      2843       144.76