Abstract
---
This paper describes the Tencent AI Lab submission to the WMT 2020 shared task on chat translation between English and German. Our neural machine translation (NMT) systems are built on sentence-level, document-level, non-autoregressive (NAT), and pretrained models. We integrate a number of advanced techniques into our systems, including data selection, back/forward translation, larger-batch learning, model ensembling, fine-tuning, and system combination. Specifically, we propose a hybrid data selection method to select high-quality, in-domain sentences from out-of-domain data. To better capture source contexts, we augment NAT models with evolved cross-attention. Furthermore, we explore transferring general knowledge from four different pre-trained language models to the downstream translation task. In general, we present extensive experimental results for this new translation task. Among all participants, our German-to-English primary system ranked second in terms of BLEU score.
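The abstract does not detail the hybrid data selection method, but a common baseline for selecting in-domain sentences from out-of-domain data is Moore-Lewis cross-entropy difference scoring: rank candidates by how much better an in-domain language model predicts them than a general-domain one. The sketch below is a toy illustration of that idea only, not the paper's method; the unigram language models and the function names (`moore_lewis_rank`, `unigram_lm`) are hypothetical simplifications.

```python
import math
from collections import Counter

def unigram_lm(corpus):
    """Build an add-one-smoothed unigram LM from a list of sentences.

    Toy stand-in for the neural or n-gram LMs used in practice.
    """
    counts = Counter(tok for sent in corpus for tok in sent.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 for the unseen-token class
    return lambda tok: (counts[tok] + 1) / (total + vocab)

def cross_entropy(lm, sentence):
    # Per-token cross-entropy (bits) of the sentence under the LM.
    toks = sentence.split()
    return -sum(math.log2(lm(t)) for t in toks) / max(len(toks), 1)

def moore_lewis_rank(candidates, in_domain, general):
    """Rank out-of-domain candidates by cross-entropy difference.

    Lower H_in(s) - H_gen(s) means the sentence looks more like the
    in-domain data relative to the general corpus.
    """
    lm_in = unigram_lm(in_domain)
    lm_gen = unigram_lm(general)
    return sorted(
        candidates,
        key=lambda s: cross_entropy(lm_in, s) - cross_entropy(lm_gen, s),
    )
```

For chat translation, `in_domain` would hold chat-style sentences and `general` the large out-of-domain pool; the top-ranked candidates are then kept for training.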
Year | Venue | DocType
---|---|---
2020 | WMT@EMNLP | Conference
Citations | PageRank | References
---|---|---
0 | 0.34 | 22
Authors (6)
---
Name | Order | Citations | PageRank
---|---|---|---
Longyue Wang | 1 | 72 | 18.24 |
Zhaopeng Tu | 2 | 518 | 39.95 |
Xing Wang | 3 | 58 | 10.07 |
Li Ding | 4 | 26 | 7.02 |
Ding Liang | 5 | 161 | 17.45 |
Shuming Shi | 6 | 620 | 58.27 |