Title
Better hypothesis testing for statistical machine translation: controlling for optimizer instability
Abstract
In statistical machine translation, a researcher seeks to determine whether some innovation (e.g., a new feature, model, or inference algorithm) improves translation quality in comparison to a baseline system. To answer this question, he runs an experiment to evaluate the behavior of the two systems on held-out data. In this paper, we consider how to make such experiments more statistically reliable. We provide a systematic analysis of the effects of optimizer instability (an extraneous variable that is seldom controlled for) on experimental outcomes, and make recommendations for reporting results more accurately.
Year
2011
Venue
ACL (Short Papers)
Keywords
translation quality, statistical machine translation, held-out data, extraneous variable, inference algorithm, hypothesis testing, baseline system, new feature, experimental outcome, systematic analysis, optimizer instability
Field
Computer science, Inference, Machine translation, Artificial intelligence, Natural language processing, Baseline system, Statistical hypothesis testing, Machine learning
DocType
Conference
Volume
P11-2
Citations
202
PageRank
4.94
References
25
Authors
4
Name | Order | Citations | PageRank
Jonathan H. Clark | 1 | 411 | 16.42
Chris Dyer | 2 | 5438 | 232.28
Alon Lavie | 3 | 2606 | 177.91
Noah A. Smith | 4 | 5867 | 314.27