Abstract | ||
---|---|---|
Until quite recently, extending Phrase-based Statistical Machine Translation (PBSMT) with syntactic structure caused system performance to deteriorate. In this work we show that incorporating lexical syntactic descriptions in the form of supertags can yield significantly better PBSMT systems. We describe a novel PBSMT model that integrates supertags into the target language model and the target side of the translation model. Two kinds of supertags are employed: those from Lexicalized Tree-Adjoining Grammar and Combinatory Categorial Grammar. Despite the differences between these two approaches, the supertaggers give similar improvements. In addition to supertagging, we also explore the utility of a surface global grammaticality measure based on combinatory operators. We perform various experiments on the Arabic to English NIST 2005 test set addressing issues such as sparseness, scalability and the utility of system subcomponents. Our best result (0.4688 BLEU) improves by 6.1% relative to a state-of-the-art PBSMT model, which compares very favourably with the leading systems on the NIST 2005 task. |
Year | Venue | Keywords |
---|---|---|
2006 | ACL | combinatory categorial grammar |
Field | DocType | Volume |
Computer science, Machine translation, Phrase, Synchronous context-free grammar, Speech recognition, NIST, Machine translation software usability, Combinatory categorial grammar, Natural language processing, Artificial intelligence, Syntax, Language model | Conference | P07-1 |
Citations | PageRank | References |
32 | 1.09 | 12 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Hany Hassan | 1 | 277 | 26.16 |
Khalil Sima'an | 2 | 443 | 50.32 |
Andy Way | 3 | 881 | 126.78 |