Title
Lattice BLEU oracles in machine translation
Abstract
The search space of Phrase-Based Statistical Machine Translation (PBSMT) systems can be represented as a directed acyclic graph (lattice). By exploring this search space, it is possible to analyze and understand the failures of PBSMT systems. Indeed, useful diagnoses can be obtained by computing the so-called oracle hypotheses, which are the hypotheses in the search space that have the highest quality score. For standard SMT metrics, however, this problem is NP-hard and can only be solved approximately. In this work, we present two new methods for efficiently computing oracles on lattices: the first is based on a linear approximation of the corpus BLEU score and is solved using generic shortest distance algorithms; the second relies on an Integer Linear Programming (ILP) formulation of oracle decoding that incorporates count clipping constraints. This ILP can be solved either directly with a standard ILP solver or with Lagrangian relaxation techniques. These new decoders are evaluated and compared with several alternatives from the literature for three language pairs, using lattices produced by two PBSMT systems.
Year
2013
DOI
10.1145/2513147
Venue
TSLP
Keywords
lattice bleu oracle,standard ilp solver,highest quality score,pbsmt system,new decoder,corpus bleu score,so-called oracle hypothesis,search space,standard smt metrics,machine translation,new method,oracle decoding,integer linear programming,bleu,lattices
Field
Linear approximation,Computer science,Machine translation,Oracle,Algorithm,Theoretical computer science,Directed acyclic graph,Integer programming,Decoding methods,Solver,Lagrangian relaxation
DocType
Journal
Volume
10
Issue
4
ISSN
1550-4875
Citations
3
PageRank
0.39
References
36
Authors
3
Name, Order, Citations, PageRank
Artem Sokolov, 1, 153, 16.08
Guillaume Wisniewski, 2, 118, 27.53
François Yvon, 3, 5, 0.80