Title |
---|
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation |
Abstract |
---|
Over the last few years, two promising research directions in low-resource neural machine translation (NMT) have emerged. The first focuses on utilizing high-resource languages to improve the quality of low-resource languages via multilingual NMT. The second direction employs monolingual data with self-supervision to pre-train translation models, followed by fine-tuning on small amounts of supervised data. In this work, we join these two lines of research and demonstrate the efficacy of monolingual data with self-supervision in multilingual NMT. We offer three major results: (i) Using monolingual data significantly boosts the translation quality of low-resource languages in multilingual models. (ii) Self-supervision improves zero-shot translation quality in multilingual models. (iii) Leveraging monolingual data with self-supervision provides a viable path towards adding new languages to multilingual models, reaching up to 33 BLEU on Romanian-English (ro-en) translation without any parallel data or back-translation. |
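The abstract does not spell out the self-supervised objective, but a MASS-style masked-span denoising task over monolingual text, trained jointly with supervised translation in a language-token-tagged multilingual model, is the kind of setup it describes. The sketch below is a minimal illustration of how one such training example might be constructed; the `mass_example` helper, the `<2ro>` target-language token, and the 50% mask ratio are illustrative assumptions, not the authors' exact configuration.

```python
import random

MASK = "<mask>"

def mass_example(tokens, lang_token, mask_ratio=0.5, rng=random):
    """Build one MASS-style masked seq2seq example from a monolingual
    sentence: the encoder sees the sentence with a contiguous span
    replaced by <mask> tokens, and the decoder must reconstruct that
    span. Prepending a target-language token lets the same model be
    trained jointly on these examples and on supervised translation
    pairs (the usual multilingual NMT convention)."""
    n = len(tokens)
    span_len = max(1, int(n * mask_ratio))
    start = rng.randint(0, n - span_len)  # inclusive bounds
    enc_in = ([lang_token] + tokens[:start]
              + [MASK] * span_len + tokens[start + span_len:])
    dec_out = tokens[start:start + span_len]  # only the hidden span
    return enc_in, dec_out

# A monolingual Romanian sentence becomes a self-supervised example
# tagged with its own language token; during training such examples
# would be mixed into the same batches as parallel translation pairs.
enc, dec = mass_example("acesta este un exemplu simplu".split(), "<2ro>")
print(enc)  # e.g. ['<2ro>', 'acesta', '<mask>', '<mask>', 'exemplu', 'simplu']
print(dec)  # e.g. ['este', 'un']
```

Mixing self-supervised examples like these with parallel data in a single multilingual model is what, per the abstract, allows a new language such as Romanian to be added without any ro-en parallel data or back-translation.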
Year | Venue | DocType |
---|---|---
2020 | ACL | Conference |
ISSN | Citations | PageRank
---|---|---
ACL 2020 | 0 | 0.34

References | Authors
---|---
0 | 8
Name | Order | Citations | PageRank |
---|---|---|---
Aditya Siddhant | 1 | 7 | 4.46 |
Ankur Bapna | 2 | 36 | 8.45 |
Yuan Cao | 3 | 548 | 35.60 |
Orhan Firat | 4 | 281 | 29.13 |
Xu Chen | 5 | 30 | 5.73 |
Sneha Kudugunta | 6 | 17 | 1.35 |
Naveen Arivazhagan | 7 | 24 | 3.98 |
Yonghui Wu | 8 | 1065 | 72.78 |