Smaller alignment models for better translations: unsupervised word alignment with the l0-norm - Citegraph

Paper Info

Title
Smaller alignment models for better translations: unsupervised word alignment with the l0-norm

Abstract
Two decades after their invention, the IBM word-based translation models, widely available in the GIZA++ toolkit, remain the dominant approach to word alignment and an integral part of many statistical translation systems. Although many models have surpassed them in accuracy, none have supplanted them in practice. In this paper, we propose a simple extension to the IBM models: an l0 prior to encourage sparsity in the word-to-word translation model. We explain how to implement this extension efficiently for large-scale data (also released as a modification to GIZA++) and demonstrate, in experiments on Czech, Arabic, Chinese, and Urdu to English translation, significant improvements over IBM Model 4 in both word alignment (up to +6.7 F1) and translation quality (up to +1.4 BLEU).

Year	Venue	Keywords
2012	ACL	translation quality, IBM word-based translation model, Smaller alignment model, better translation, unsupervised word alignment, word alignment, IBM model, dominant approach, statistical translation system, English translation, word-to-word translation model, simple extension, integral part
DocType	Volume	Citations
Conference	aclanthology.org	0
PageRank	References	Authors
0.34	18	3

Authors (3 rows)

Name	Order	Citations	PageRank
Ashish Vaswani	1	901	32.81
Liang Huang	2	1484	75.40
David Chiang	3	2843	144.76

Cited by (0 rows)

References (18 rows)
