Abstract | ||
---|---|---|
Preordering source-side sentences has proved useful in improving statistical machine translation. Most prior work has used a source-language parser together with rules that map the source-language word order into the target-language word order. The requirement for a source-language parser is a major drawback, which we seek to overcome in this paper. Instead of parsing and then applying rules to reorder the source-side sentence, we learn a model that directly reorders source-side sentences to match the target word order, using a small parallel corpus with high-quality word alignments. Our model learns pairwise costs of one word immediately preceding another. We use the Lin-Kernighan heuristic to find the best source reordering efficiently during training and testing, and show that it suffices to provide good-quality reordering. We show gains in translation performance based on our reordering model for translating from Hindi to English, Urdu to English (with a public dataset), and English to Hindi. For English to Hindi, we show that our technique achieves better performance than a method that applies rules to the source-side English parse. |
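The abstract frames reordering as a search for the permutation that minimizes a sum of pairwise "immediately precedes" costs, which has the shape of an asymmetric traveling-salesman problem. The Python sketch below illustrates only that objective under loud assumptions: the cost values are made up by hand rather than learned from alignments, the three-word example is a hypothetical English gloss, and the search is exhaustive enumeration plus a simple move-one-word local search standing in for the Lin-Kernighan heuristic, which is considerably more elaborate. All function and variable names are illustrative and do not come from the paper.

```python
from itertools import permutations


def order_cost(order, cost, start_cost):
    """Cost of an ordering: start cost of the first word plus
    cost[a][b] for every word b that immediately follows word a."""
    total = start_cost[order[0]]
    for a, b in zip(order, order[1:]):
        total += cost[a][b]
    return total


def best_order_exhaustive(n, cost, start_cost):
    """Exact minimizer by enumeration; only feasible for very short sentences."""
    return min(permutations(range(n)),
               key=lambda p: order_cost(p, cost, start_cost))


def local_search(order, cost, start_cost):
    """Simple stand-in for Lin-Kernighan (NOT the real heuristic):
    move one word to another position whenever that lowers the cost."""
    order = list(order)
    improved = True
    while improved:
        improved = False
        best = order_cost(order, cost, start_cost)
        for i in range(len(order)):
            for j in range(len(order)):
                if i == j:
                    continue
                cand = order[:i] + order[i + 1:]
                cand.insert(j, order[i])
                c = order_cost(cand, cost, start_cost)
                if c < best:
                    order, best, improved = cand, c, True
    return order


# Hypothetical verb-final source gloss; the target order is "Ram eats apples".
words = ["Ram", "apples", "eats"]
# Hand-picked (not learned) costs: cost[a][b] is low when b should follow a.
start_cost = [0, 5, 5]
cost = [
    [0, 5, 0],  # after "Ram":    "eats" is cheap
    [5, 0, 5],  # after "apples": nothing is cheap
    [5, 0, 0],  # after "eats":   "apples" is cheap
]

exact = best_order_exhaustive(len(words), cost, start_cost)
approx = local_search(range(len(words)), cost, start_cost)
reordered = [words[i] for i in approx]
```

On this toy input both searches agree, but in general a local search of this kind can stop at a local minimum, which is why the choice of heuristic matters for longer sentences.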
Year | Venue | Keywords |
---|---|---|
2011 | EMNLP | reorder source side sentence,best source reordering,improved machine translation,target language word order,source language,high-quality word alignment,target word order,source language word order,source language parser,word reordering model,source side english parse,source side sentence |
Field | DocType | Volume |
Pairwise comparison,Heuristic,Word order,Computer science,Hindi,Machine translation,Speech recognition,Urdu,Artificial intelligence,Natural language processing,Parsing,Sentence | Conference | D11-1 |
Citations | PageRank | References |
26 | 0.78 | 28 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Karthik Visweswariah | 1 | 400 | 38.22 |
Rajakrishnan Rajkumar | 2 | 94 | 6.72 |
Ankur Gandhe | 3 | 42 | 5.88 |
Ananthakrishnan Ramanathan | 4 | 89 | 6.59 |
Jiri Navratil | 5 | 314 | 31.36 |