Abstract |
---|
Many machine translation evaluation metrics have been proposed since the seminal BLEU metric, and many of them have been found to consistently outperform BLEU, as demonstrated by their better correlations with human judgment. It has long been hoped that by tuning machine translation systems against these new-generation metrics, advances in automatic machine translation evaluation would lead directly to advances in automatic machine translation. To date, however, there has been no unambiguous report that these new metrics can improve a state-of-the-art machine translation system over its BLEU-tuned baseline. In this paper, we demonstrate that tuning Joshua, a hierarchical phrase-based statistical machine translation system, with the TESLA metrics results in significantly better human-judged translation quality than the BLEU-tuned baseline. TESLA-M in particular is simple and performs well in practice on large datasets. We release our implementation under an open source license. We hope that this work will encourage the machine translation community to finally move away from BLEU as the unquestioned default and to consider the new-generation metrics when tuning their systems. |
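For context on the baseline metric the abstract argues against, the sketch below implements sentence-level BLEU in its textbook form (clipped n-gram precisions up to 4-grams, geometric mean, brevity penalty). This is a minimal single-reference, unsmoothed illustration, not the implementation used in the paper's experiments or in Joshua's tuning loop.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu(hypothesis, reference, max_n=4):
    """Sentence-level BLEU: single reference, uniform weights, no smoothing."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clipped matches: each hypothesis n-gram is credited at most as
        # many times as it occurs in the reference.
        match = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(1, sum(hyp_counts.values()))
        if match == 0:
            return 0.0  # unsmoothed: any zero precision zeroes the score
        precisions.append(match / total)
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(hypothesis))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

BLEU's reliance on exact n-gram overlap is precisely what metrics like TESLA relax (e.g. via synonym and part-of-speech matching), which is why better correlation with human judgment is possible.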
Year | Venue | Keywords |
---|---|---|
2011 | EMNLP | machine translation community, human-judged translation quality, automatic machine translation, BLEU-tuned baseline, better machine translation, state-of-the-art machine translation system, translation system, machine translation evaluation metrics, tuning machine translation system, automatic machine translation evaluation, new generation metrics, better evaluation metrics |
Field | DocType | Volume
---|---|---|
BLEU, Evaluation of machine translation, Computer science, Machine translation, Machine translation system, Phrase, Human judgment, ROUGE, Natural language processing, Artificial intelligence, Machine learning, License | Conference | D11-1
Citations | PageRank | References
---|---|---|
20 | 2.63 | 21
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chang Liu | 1 | 87 | 6.78
Daniel Dahlmeier | 2 | 460 | 29.67 |
Hwee Tou Ng | 3 | 4092 | 300.40 |