Abstract |
---|
We present a novel approach for extracting a minimal synchronous context-free grammar (SCFG) for Hiero-style statistical machine translation using a non-parametric Bayesian framework. Our approach is designed to extract rules that are licensed by the word alignments and heuristically extracted phrase pairs. Our Bayesian model limits the number of SCFG rules extracted by sampling from the space of all possible hierarchical rules. Additionally, our informed prior, based on the lexical alignment probabilities, biases the grammar toward extracting high-quality rules, leading to improved generalization and the automatic identification of commonly reused rules. We show that our Bayesian model is able to extract a minimal set of hierarchical phrase rules without impacting translation quality as measured by the BLEU score. |
Year | Venue | Keywords |
---|---|---|
2011 | WMT@EMNLP | hierarchical phrase rule, novel approach, minimal set, hiero-style statistical machine translation, phrase pair, scfg rule, minimal synchronous context-free grammar, hierarchical phrase-based translation, non-parametric bayesian framework, minimal scfg rule, bayesian model, bayesian extraction, high quality rule |
Field | DocType | Citations |
Heuristic, Bayesian inference, Computer science, Machine translation, Phrase, Synchronous context-free grammar, Grammar, Sampling (statistics), Artificial intelligence, Natural language processing, Machine learning, Bayesian probability | Conference | 5 |
PageRank | References | Authors |
0.41 | 15 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Baskaran Sankaran | 1 | 155 | 13.65 |
Gholamreza Haffari | 2 | 381 | 59.13 |
Anoop Sarkar | 3 | 1017 | 88.82 |