Abstract | ||
---|---|---|
In this paper we show how to learn rules to improve the performance of a machine translation system. Given a system consisting of two translation functions (one from language A to language B and one from B to A), training text is translated from A to B and back again to A. Using these two translations, differences in knowledge between the two translation functions are identified, and rules are learned to improve the functions. Context-independent rules are learned where the information suggests only a single possible translation for a word. When there are multiple alternative translations for a word, a likelihood ratio test is used to identify words that co-occur significantly with each case. These words are then used as context in context-dependent rules. Applied to the Pan American Health Organization corpus of 20,084 sentences, the learned rules improve the understandability of the translation produced by the SDL International engine on 78% of sentences, with high precision. |
Year | DOI | Venue |
---|---|---|
2003 | 10.1007/978-3-540-39857-8_20 | Lecture Notes in Artificial Intelligence |

Keywords | Field | DocType |
---|---|---|
context dependent, likelihood ratio test, machine translation | Rule-based machine translation, Example-based machine translation, Computer science, Machine translation, Speech recognition, Synchronous context-free grammar, Machine translation software usability, Transfer-based machine translation, Natural language processing, Artificial intelligence, Computer-assisted translation, Sentence | Conference |

Volume | ISSN | Citations |
---|---|---|
2837 | 0302-9743 | 2 |

PageRank | References | Authors |
---|---|---|
0.53 | 11 | 2 |

Name | Order | Citations | PageRank |
---|---|---|---|
David Kauchak | 1 | 363 | 25.92 |
Charles Elkan | 2 | 5118 | 572.94 |
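
The abstract's context-selection step (a likelihood ratio test over words that co-occur with each alternative translation) can be illustrated with a minimal sketch. The abstract does not state the exact statistic, so this assumes the standard G² log-likelihood ratio on a 2x2 contingency table with a 3.84 cutoff (chi-square, one degree of freedom, p = 0.05). The function names, the data layout, and the toy "banco"/"orilla" examples are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def log_likelihood_ratio(k11, k12, k21, k22):
    """G^2 statistic for a 2x2 table of counts (assumed form of the test)."""
    def xlogx(x):
        return x * math.log(x) if x > 0 else 0.0

    n = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    # G^2 = 2 * sum(O * ln(O / E)) with E = row * col / n, expanded.
    g2 = 2.0 * (
        xlogx(k11) + xlogx(k12) + xlogx(k21) + xlogx(k22)
        - xlogx(row1) - xlogx(row2) - xlogx(col1) - xlogx(col2)
        + xlogx(n)
    )
    return max(g2, 0.0)  # guard against tiny negative rounding error


def context_words(examples, target, threshold=3.84):
    """Return words positively associated with one translation choice.

    examples: list of (context_words, translation_label) pairs (hypothetical layout).
    """
    n = len(examples)
    n_target = sum(1 for _, label in examples if label == target)
    overall, with_target = Counter(), Counter()
    for words, label in examples:
        for w in set(words):
            overall[w] += 1
            if label == target:
                with_target[w] += 1

    selected = []
    for w, total in overall.items():
        k11 = with_target[w]        # word present, target translation
        k12 = total - k11           # word present, other translation
        k21 = n_target - k11        # word absent, target translation
        k22 = n - n_target - k12    # word absent, other translation
        g2 = log_likelihood_ratio(k11, k12, k21, k22)
        # Keep significant words that occur with the target more than expected.
        if g2 > threshold and k11 * n > total * n_target:
            selected.append(w)
    return sorted(selected)


# Toy data: two hypothetical translations of an ambiguous word.
examples = (
    [({"the", "money"}, "banco")] * 10
    + [({"the", "river"}, "orilla")] * 10
)
print(context_words(examples, "banco"))
```

In this toy data, "the" appears with both translations equally and is filtered out, while a word seen with only one translation passes the test and could serve as context in a context-dependent rule.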