Abstract
---
Recent work has demonstrated the effectiveness of cross-lingual language model pretraining for cross-lingual understanding. In this study, we present the results of two larger multilingual masked language models, with 3.5B and 10.7B parameters. Our two new models, dubbed XLM-R XL and XLM-R XXL, outperform XLM-R by 1.8% and 2.4% average accuracy on XNLI. Our model also outperforms RoBERTa-Large on several English tasks of the GLUE benchmark by 0.3% on average while handling 99 more languages. This suggests that larger-capacity models for language understanding may obtain strong performance on both high- and low-resource languages. We make our code and models publicly available.
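Since the abstract centers on scaled-up multilingual masked language modeling, a minimal sketch of masked-token prediction with this model family may help. It uses the Hugging Face `transformers` library; the checkpoint id `facebook/xlm-roberta-xl` is an assumption about where the released 3.5B model is hosted, and `xlm-roberta-large` is a smaller drop-in fallback.

```python
# Minimal sketch: masked-token prediction with an XLM-R-family checkpoint.
# Assumption: the 3.5B model is on the Hugging Face Hub as
# "facebook/xlm-roberta-xl"; "xlm-roberta-large" works as a fallback.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_name = "facebook/xlm-roberta-xl"  # assumed hub id for XLM-R XL
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
model.eval()

# The same masked-LM interface works across the ~100 supported languages.
text = f"The capital of France is {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and take the highest-scoring token there.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(top_id))  # expected: something like "Paris"
```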
Year | DOI | Venue
---|---|---|
2021 | 10.18653/v1/2021.repl4nlp-1.4 | RepL4NLP 2021: Proceedings of the 6th Workshop on Representation Learning for NLP

DocType | Citations | PageRank
---|---|---|
Conference | 0 | 0.34

References | Authors
---|---|
0 | 5

Name | Order | Citations | PageRank |
---|---|---|---|
Naman Goyal | 1 | 0 | 2.03 |
Jingfei Du | 2 | 19 | 4.47 |
Myle Ott | 3 | 524 | 26.11 |
Giri Anantharaman | 4 | 0 | 0.34 |
Alexis Conneau | 5 | 342 | 15.03 |