Abstract | ||
---|---|---|
Following the two preceding WMT Shared Tasks on Parallel Corpus Filtering (Koehn et al., 2018, 2019), we again posed the challenge of assigning sentence-level quality scores to very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting the highest-quality data for training machine translation systems. This year, the task tackled the low-resource conditions of Pashto–English and Khmer–English and also included the challenge of sentence alignment from document pairs. |
Year | Venue | DocType | Citations | PageRank | References |
---|---|---|---|---|---|
2020 | WMT@EMNLP | Conference | 0 | 0.34 | 0 |
Authors | ||
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Philipp Koehn | 1 | 0 | 1.69 |
Vishrav Chaudhary | 2 | 8 | 8.26 |
Ahmed El-Kishky | 3 | 0 | 0.68 |
Naman Goyal | 4 | 0 | 1.01 |
Peng-Jen Chen | 5 | 0 | 1.35 |
Francisco Guzmán | 6 | 54 | 13.51 |