Abstract
---
Sharing source- and target-side vocabularies and word embeddings has been a popular practice in neural machine translation (NMT) for similar languages (e.g., English-to-French or English-to-German translation). The success of such word-level sharing motivates us to go one step further: we consider model-level sharing and tie all the parameters of the encoder and decoder of an NMT model. We share the encoder and decoder of Transformer (Vaswani et al. 2017), the state-of-the-art NMT model, and obtain a compact model named Tied Transformer. Experimental results demonstrate that this simple method works well for both similar and dissimilar language pairs. We empirically verify our framework for both supervised and unsupervised NMT: we achieve a 35.52 BLEU score on IWSLT 2014 German-to-English translation, 28.98/29.89 BLEU scores on WMT 2014 English-to-German translation without/with monolingual data, and a 22.05 BLEU score on WMT 2016 unsupervised German-to-English translation.
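The abstract's two levels of sharing can be made concrete with a short sketch. The following is a minimal PyTorch illustration, not the authors' implementation: the class name `TiedTransformerSketch` and its `encode`/`decode` methods are hypothetical, and the decoder pass omits causal masking and proper cross-attention for brevity. Word-level sharing appears as a single embedding table for both languages; model-level sharing appears as one layer stack reused for both the encoder and decoder passes.

```python
import torch
import torch.nn as nn

class TiedTransformerSketch(nn.Module):
    """Illustrative only: ties the encoder and decoder by reusing
    one parameter stack for both passes (model-level sharing)."""

    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=6):
        super().__init__()
        # Word-level sharing: one embedding table for source and target.
        self.embed = nn.Embedding(vocab_size, d_model)
        # Model-level sharing: a single stack of layers used twice.
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            for _ in range(num_layers)
        )
        # Output projection tied to the embedding weights.
        self.proj = nn.Linear(d_model, vocab_size, bias=False)
        self.proj.weight = self.embed.weight

    def encode(self, src_ids):
        x = self.embed(src_ids)
        for layer in self.layers:  # these parameters ...
            x = layer(x)
        return x

    def decode(self, tgt_ids, memory):
        # Simplification: a real decoder needs a causal mask and
        # cross-attention over `memory`; here we just concatenate.
        x = torch.cat([memory, self.embed(tgt_ids)], dim=1)
        for layer in self.layers:  # ... are reused for decoding
            x = layer(x)
        return self.proj(x[:, memory.size(1):])

# Hypothetical usage: batch of 2, source length 7, target length 5.
model = TiedTransformerSketch(vocab_size=1000)
src = torch.randint(0, 1000, (2, 7))
tgt = torch.randint(0, 1000, (2, 5))
logits = model.decode(tgt, model.encode(src))  # shape (2, 5, 1000)
```

Because the encoder and decoder share one parameter stack, the sketch has roughly half the layer parameters of a standard Transformer of the same depth, which is the compactness the abstract refers to.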
Year | Venue | Field
---|---|---
2019 | Thirty-Third AAAI Conference on Artificial Intelligence / Thirty-First Innovative Applications of Artificial Intelligence Conference / Ninth AAAI Symposium on Educational Advances in Artificial Intelligence | BLEU, Computer science, Machine translation, Encoder, Natural language processing, Artificial intelligence, Machine learning, German
DocType | Citations | PageRank
---|---|---
Conference | 2 | 0.36

References | Authors
---|---
0 | 6
Name | Order | Citations | PageRank |
---|---|---|---|
Yingce Xia | 1 | 130 | 19.23 |
Tianyu He | 2 | 11 | 2.72
Xu Tan | 3 | 88 | 23.94 |
Fei Tian | 4 | 160 | 11.88 |
Di He | 5 | 154 | 19.76 |
Tao Qin | 6 | 2384 | 147.25 |