Title
Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation.
Abstract
We present our guidelines and annotation procedure for creating a human-corrected, post-edited machine translation corpus for Modern Standard Arabic. Our overarching goal is to use the annotated corpus to develop automatic machine translation post-editing systems for Arabic that can help accelerate the human revision of translated texts. Creating any manually annotated corpus usually presents many challenges. To address them, we created comprehensive and simplified annotation guidelines, which were used by a team of five annotators and one lead annotator. To ensure high agreement between the annotators, multiple training sessions were held and inter-annotator agreement was measured regularly to check annotation quality. The resulting corpus of manually post-edited English-to-Arabic translations of articles is the largest to date for this language pair.
Year
2016
Venue
LREC 2016 - Tenth International Conference on Language Resources and Evaluation
Keywords
Post-Editing, Guidelines, Annotation
Field
Annotation, Computer science, Arabic machine translation, Machine translation, Text corpus, Speech recognition, Machine translation software usability, Corpus linguistics, Natural language processing, Artificial intelligence, Linguistics
DocType
Conference
Citations
3
PageRank
0.40
References
0
Authors
6
Name              Order  Citations  PageRank
Wajdi Zaghouani   1      197        21.27
Nizar Habash      2      1833       145.59
Ossama Obeid      3      70         6.43
Behrang Mohit     4      188        16.06
Houda Bouamor     5      88         17.62
Kemal Oflazer     6      781        98.46