Title
A Test Suite for the Evaluation of Portuguese-English Machine Translation
Abstract
This paper describes the development of the first test suite for the language direction Portuguese-English. Designed for fine-grained linguistic analysis, the test suite comprises 330 test sentences covering 66 linguistic phenomena grouped into 14 linguistic categories. Using the test suite, eight MT systems were compared with quantitative and qualitative methods: DeepL, Google Sheets, Google Translator, Microsoft Translator, Reverso, Systran, Yandex and an internally built NMT system trained over 30 h on 2.5M sentences. Ambiguity, named entity & terminology and verb valency were found to be the categories where MT systems struggle most. Negation, pronouns, subordination, verb tense/aspect/mood and false friends are the categories where MT systems perform best.
Year: 2022
DOI: 10.1007/978-3-030-98305-5_2
Venue: COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, PROPOR 2022
Keywords: Machine translation, Evaluation, Portuguese, Test suite
DocType: Conference
Volume: 13208
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name                    Order   Citations   PageRank
Mariana Avelino         1       0           0.34
Vivien Macketanz        2       0           0.34
Eleftherios Avramidis   3       0           0.34
Sebastian Moller        4       0           0.34