Title
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training
Abstract
In this work, we formulate cross-lingual language model pre-training as maximizing mutual information between multilingual multi-granularity texts. This unified view helps us better understand the existing methods for learning cross-lingual representations. More importantly, the information-theoretic framework inspires us to propose a pre-training task based on contrastive learning. Given a bilingual sentence pair, we regard the two sentences as two views of the same meaning, and encourage their encoded representations to be more similar to each other than to negative examples. By leveraging both monolingual and parallel corpora, we jointly train the pretext tasks to improve the cross-lingual transferability of pre-trained models. Experimental results on several benchmarks show that our approach achieves considerably better performance. The code and pre-trained models are available at http://aka.ms/infoxlm.
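The contrastive pre-training task described in the abstract is, in essence, an InfoNCE-style objective: paired translations act as two positive views and the other sentences in the batch act as negatives. Below is a minimal PyTorch sketch of such a loss, assuming a shared multilingual encoder has already produced fixed-size sentence embeddings; the function name, batch shapes, and temperature value are illustrative, not taken from the paper's released code.

    # Minimal sketch (not the authors' code) of an InfoNCE-style
    # cross-lingual contrastive loss: each translation pair is pushed
    # closer together than the in-batch negative examples.
    import torch
    import torch.nn.functional as F

    def cross_lingual_contrastive_loss(src_emb, tgt_emb, temperature=0.07):
        """src_emb, tgt_emb: (batch, dim) sentence encodings of a
        translation pair; row i of src_emb translates row i of tgt_emb."""
        src = F.normalize(src_emb, dim=-1)
        tgt = F.normalize(tgt_emb, dim=-1)
        # Similarity of every source sentence to every target sentence;
        # off-diagonal entries serve as negatives.
        logits = src @ tgt.t() / temperature
        labels = torch.arange(src.size(0), device=src.device)
        # Cross-entropy against the diagonal maximizes agreement with the
        # true translation, a lower bound on the mutual information
        # between the two views under the InfoNCE formulation.
        return F.cross_entropy(logits, labels)

    # Usage: embeddings would come from the shared multilingual encoder.
    src = torch.randn(8, 768)  # e.g., English sentences
    tgt = torch.randn(8, 768)  # their translations
    loss = cross_lingual_contrastive_loss(src, tgt)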
Year: 2021
Venue: NAACL-HLT
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors (10)
Name            | Order | Citations | PageRank
Zewen Chi       | 1     | 2         | 2.39
Li Dong         | 2     | 582       | 31.86
Furu Wei        | 3     | 1956      | 107.57
Nan Yang        | 4     | 583       | 22.70
Saksham Singhal | 5     | 2         | 1.71
Wenhui Wang     | 6     | 0         | 1.01
Xia Song        | 7     | 0         | 2.03
Xian-Ling Mao   | 8     | 99        | 25.19
Heyan Huang     | 9     | 173       | 61.47
Ming Zhou       | 10    | 4262      | 251.74