Abstract |
---|
This paper presents a maximum entropy machine translation system using a minimal set of translation blocks (phrase-pairs). While recent phrase-based statistical machine translation (SMT) systems achieve significant improvement over the original source-channel statistical translation models, they 1) use a large inventory of blocks which have significant overlap and 2) limit the use of training to just a few parameters (on the order of ten). In contrast, we show that our proposed minimalist system (DTM2) achieves equal or better performance by 1) recasting the translation problem in the traditional statistical modeling approach using blocks with no overlap and 2) relying on training most system parameters (on the order of millions or larger). The new model is a direct translation model (DTM) formulation which allows easy integration of additional/alternative views of both source and target sentences, such as segmentation for a source language such as Arabic, part-of-speech of both source and target, etc. We show improvements over a state-of-the-art phrase-based decoder in Arabic-English translation. |
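As a rough illustration of the log-linear (maximum entropy) scoring the abstract describes — a sketch only, not the paper's actual DTM2 implementation — a candidate translation can be scored as a weighted sum of binary/real-valued features, with the weights trained per feature (the paper's "millions of parameters"). The feature names and weights below are hypothetical.

```python
import math

def maxent_score(features, weights):
    """Log-linear score: dot product of active feature values
    with their learned weights (missing features score 0)."""
    return sum(weights.get(name, 0.0) * value
               for name, value in features.items())

def maxent_probs(candidates, weights):
    """Softmax normalization over candidate translations,
    i.e. P(t | s) = exp(w . f(s, t)) / Z(s)."""
    scores = [maxent_score(f, weights) for f in candidates]
    z = sum(math.exp(s) for s in scores)
    return [math.exp(s) / z for s in scores]

# Hypothetical feature weights; a DTM2-style system would
# train millions of such block and part-of-speech features.
weights = {
    "block:kitab->book": 1.2,
    "block:kitab->writing": 0.3,
    "pos_match": 0.5,
}

candidates = [
    {"block:kitab->book": 1.0, "pos_match": 1.0},
    {"block:kitab->writing": 1.0, "pos_match": 1.0},
]
probs = maxent_probs(candidates, weights)
```

Here the higher-weighted block feature makes the first candidate the more probable translation; in the full model the same machinery also accommodates alternative views of the sentence (Arabic segmentation, part-of-speech tags) as additional features.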
Year | Venue | Keywords |
---|---|---|
2007 | HLT-NAACL | statistical model, part of speech, maximum entropy
Field | DocType | Citations
---|---|---
Rule-based machine translation, Example-based machine translation, Segmentation, Computer science, Machine translation, Phrase, Natural language processing, Artificial intelligence, Transfer-based machine translation, Statistical model, Principle of maximum entropy | Conference | 26
PageRank | References | Authors
---|---|---
1.30 | 2 | 2
Name | Order | Citations | PageRank |
---|---|---|---|
Abraham Ittycheriah | 1 | 534 | 61.23 |
Salim Roukos | 2 | 6248 | 845.50 |