Title
Application of clause alignment for statistical machine translation
Abstract
The paper presents a new resource-light, flexible method for clause alignment that combines the Gale-Church algorithm with internally collected textual information. The method does not rely on any pre-developed linguistic resources, which makes it well suited to resource-light clause alignment. We experiment with combining the method with the original Gale-Church algorithm (1993) applied to clause alignment. The performance of this flexible method, as it is referred to hereafter, is measured on a specially designed test corpus. Clause alignment is explored as a means of providing improved training data for Statistical Machine Translation (SMT). A series of experiments with Moses demonstrates ways to modify the parallel resource and their effects on translation quality: (1) baseline training with a Bulgarian-English parallel corpus aligned at the sentence level; (2) training based on parallel clause pairs; (3) training with clause reordering, where the clauses in each source language (SL) sentence are reordered according to the order of the clauses in the target language (TL) sentence. Evaluation is based on BLEU score and shows a small improvement when using the clause-aligned corpus.
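The clause reordering in setup (3) can be illustrated with a minimal sketch. It assumes that each sentence is already split into clauses and that the clause alignment is given as 1:1 pairs of (source index, target index). The function name, the alignment format, and the handling of unaligned clauses are hypothetical choices made for illustration; the abstract does not specify them.

```python
from typing import List, Tuple


def reorder_source_clauses(
    source_clauses: List[str],
    alignment: List[Tuple[int, int]],
) -> List[str]:
    """Reorder source-language clauses to follow the target-language clause order.

    `alignment` holds (source_index, target_index) pairs. The 1:1 format is an
    assumption of this sketch, not something stated in the abstract.
    """
    # Map each aligned source clause to the position of its target clause.
    target_pos = {src: tgt for src, tgt in alignment}

    def sort_key(i: int) -> Tuple[int, int]:
        # Aligned clauses come first, ordered by target position.
        # Unaligned clauses are appended in their original order (an assumption).
        if i in target_pos:
            return (0, target_pos[i])
        return (1, i)

    order = sorted(range(len(source_clauses)), key=sort_key)
    return [source_clauses[i] for i in order]


if __name__ == "__main__":
    clauses = ["clause A", "clause B", "clause C"]
    # Source clause 0 aligns to target clause 2, 1 to 0, and 2 to 1.
    print(reorder_source_clauses(clauses, [(0, 2), (1, 0), (2, 1)]))
```

In a pipeline like the one described, the reordered source sentence would replace the original in the training corpus before a system such as Moses is trained. How many-to-one or unaligned clauses should be treated is left open here.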
Year
2012
Venue
SSST@ACL
Keywords
statistical machine translation, light flexible method, bulgarian-english parallel corpus, baseline training, resource light clause alignment, clause reordering, parallel clause pair, new resource, improved training data, clause alignment, flexible method
Field
Training set, Textual information, Computer science, Machine translation, Speech recognition, Artificial intelligence, Natural language processing, Sentence
DocType
Conference
Citations
1
PageRank
0.39
References
13
Authors
9