Title | ||
---|---|---|
XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering. |
Abstract | ||
---|---|---|
While natural language processing systems often focus on a single language, multilingual transfer learning has the potential to improve performance, especially for low-resource languages. We introduce XLDA, cross-lingual data augmentation, a method that replaces a segment of the input text with its translation in another language. XLDA enhances performance of all 14 tested languages of the cross-lingual natural language inference (XNLI) benchmark. With improvements of up to $4.8\%$, training with XLDA achieves state-of-the-art performance for Greek, Turkish, and Urdu. XLDA is in contrast to, and performs markedly better than, a more naive approach that aggregates examples in various languages in a way that each example is solely in one language. On the SQuAD question answering task, we see that XLDA provides a $1.0\%$ performance increase on the English evaluation set. Comprehensive experiments suggest that most languages are effective as cross-lingual augmentors, that XLDA is robust to a wide range of translation quality, and that XLDA is even more effective for randomly initialized models than for pretrained models. |
Year | Venue | DocType |
---|---|---|
2019 | arXiv: Computation and Language | Journal |
Volume | Citations | PageRank |
abs/1905.11471 | 0 | 0.34 |
References | Authors | |
0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jasdeep Singh | 1 | 3 | 1.80 |
McCann, Bryan | 2 | 105 | 7.04 |
nitish shirish keskar | 3 | 325 | 16.71 |
Caiming Xiong | 4 | 969 | 69.56 |
Richard Socher | 5 | 6770 | 230.61 |