Title | ||
---|---|---|
Data Augmentation By Concatenation For Low-Resource Translation: A Mystery And A Solution |
Abstract | ||
---|---|---|
In this paper, we investigate the driving factors behind concatenation, a simple but effective data augmentation method for low-resource neural machine translation. Our experiments suggest that discourse context is unlikely to be the reason concatenation improves BLEU by about +1 across four language pairs. Instead, we demonstrate that the improvement comes from three other factors unrelated to discourse: context diversity, length diversity, and (to a lesser extent) position shifting. |
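
The abstract names concatenation as the augmentation method but does not say how sentences are paired or joined. The following is only a minimal sketch under stated assumptions: each new training example joins two randomly chosen sentence pairs (source with source, target with target) using a single space, and the new examples are added to the original corpus. The function name `concat_augment`, its parameters, and the toy corpus are hypothetical and are not taken from the paper.

```python
import random


def concat_augment(pairs, n_new, seed=0, sep=" "):
    """Return the original pairs plus n_new concatenated pairs.

    Hypothetical helper (not from the paper): each new example joins two
    distinct, randomly chosen (source, target) pairs, concatenating the
    sources together and the targets together in the same order.
    """
    rng = random.Random(seed)  # local RNG so results are reproducible
    augmented = list(pairs)    # keep every original example
    for _ in range(n_new):
        (src_a, tgt_a), (src_b, tgt_b) = rng.sample(pairs, 2)
        augmented.append((src_a + sep + src_b, tgt_a + sep + tgt_b))
    return augmented


# Toy parallel corpus, invented purely for illustration.
corpus = [
    ("ein Haus", "a house"),
    ("ein Hund", "a dog"),
    ("eine Katze", "a cat"),
]
augmented = concat_augment(corpus, n_new=2)
```

Because the partner sentence is drawn at random, it carries no discourse relation to the first sentence. Each original sentence still appears next to new neighbours, inside a longer sequence, and sometimes at a later position, which loosely corresponds to the context diversity, length diversity, and position shifting factors the abstract lists.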
Year | Venue | DocType |
---|---|---|
2021 | IWSLT 2021: The 18th International Conference on Spoken Language Translation | Conference |
Citations | PageRank | References |
---|---|---|
0 | 0.34 | 0 |
Authors | ||
---|---|---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Toan Nguyen | 1 | 0 | 0.68 |
Kenton Murray | 2 | 0 | 0.34 |
David Chiang | 3 | 2843 | 144.76 |