Title
Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation
Abstract
Neural sequence-to-sequence models, particularly the Transformer, are the state of the art in machine translation. Yet these neural networks are very sensitive to architecture and hyperparameter settings. Optimizing these settings by grid or random search is computationally expensive because it requires many training runs. In this paper, we incorporate architecture search into a single training run through auto-sizing, which uses regularization to delete neurons in a network over the course of training. On very low-resource language pairs, we show that auto-sizing can improve BLEU scores by up to 3.9 points while removing one-third of the parameters from the model.
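The abstract's core idea, deleting neurons during training with a sparsity-inducing regularizer, can be illustrated with a short sketch. The code below is a minimal illustration, not the authors' implementation: it assumes an L2,1 (group lasso) penalty on the rows of a single weight matrix, applied as a proximal step after each gradient update so that whole rows (neurons) are driven exactly to zero and can be deleted; all names, sizes, and hyperparameters are hypothetical.

```python
# Minimal sketch of auto-sizing with a row-wise L2,1 (group) regularizer and a
# proximal update. Hypothetical toy example, not the paper's implementation.
import numpy as np

def prox_l21(W, strength, lr):
    """Proximal step for the L2,1 penalty: shrink each row's norm by lr*strength,
    setting the row exactly to zero once its norm falls below that threshold."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lr * strength / np.maximum(norms, 1e-12))
    return W * scale

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))            # 8 "neurons" (rows), 4 inputs: toy sizes
target = np.zeros_like(W)
target[:3] = rng.normal(size=(3, 4))   # pretend only 3 neurons are useful

lr, strength = 0.1, 0.5
for step in range(200):
    grad = W - target                  # stand-in gradient of a toy task loss
    W = W - lr * grad                  # ordinary SGD step on the task loss
    W = prox_l21(W, strength, lr)      # regularizer step that zeroes whole rows

alive = np.linalg.norm(W, axis=1) > 0
print(f"{alive.sum()} of {len(alive)} neurons survive; zero rows can be deleted")
```

In the paper, the same principle is applied to weight matrices inside the Transformer during a single training run, with the regularization strength controlling how many neurons are removed.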
Year
2019
DOI
10.18653/v1/D19-5625
Venue
NGT@EMNLP-IJCNLP
Field
Automotive engineering, Computer science, Machine translation, Transformer, Sizing
DocType
Conference
Volume
D19-56
Citations
0
PageRank
0.34
References
0
Authors
5
Name                Order  Citations  PageRank
Kenton Murray       1      0          0.34
Jeffery Kinnison    2      0          0.68
Toan Nguyen         3      55         15.70
Walter J. Scheirer  4      773        52.81
David Chiang        5      2843       144.76