Title
Incorporating Alternate Translations into English Translation Treebank.
Abstract
New annotation guidelines and new processing methods were developed to accommodate English treebank annotation of a parallel English/Chinese corpus of web data that includes alternate English translations (one fluent, one literal) of expressions that are idiomatic in the Chinese source. In previous machine translation programs, alternate translations of idiomatic expressions had been present in untreebanked data only, but due to the high frequency of such expressions in informal genres such as discussion forums, machine translation system developers requested that alternatives be added to the treebanked data as well. In consultation with machine translation researchers, we chose a pragmatic approach of syntactically annotating only the fluent translation, while retaining the alternate literal translation as a segregated node in the tree. Since the literal translation alternates are often incompatible with English syntax, this approach allows us to create fluent trees without losing information. This resource is expected to support machine translation efforts, and the flexibility provided by the alternate translations is an enhancement to the treebank for this purpose.
Year
2014
Venue
LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION
Keywords
English Translation Treebank, alternate translations, parsing, idiomatic expressions
Field
Expression (mathematics), Computer science, Machine translation, Artificial intelligence, Natural language processing, Syntax, Annotation, Machine translation system, Speech recognition, Treebank, Literal translation, Delegation (computing), Linguistics
DocType
Conference
Citations
2
PageRank
0.42
References
0
Authors
5
Name              Order  Citations  PageRank
Ann Bies          1      136        20.02
Justin Mott       2      27         4.93
Seth Kulick       3      221        29.66
Jennifer Garland  4      4          0.83
Colin Warner      5      90         3.53