Title
Exploring diverse features for statistical machine translation model pruning
Abstract
In phrase-based and hierarchical phrase-based statistical machine translation systems, translation performance depends heavily on the size and quality of the translation table. To meet real-time response requirements, prior work has proposed filtering the translation table. However, most existing methods rely on only one or two constraints applied as hard rules, such as discarding phrase-pairs with low translation probabilities. Such rules can be too rigid because they consider a single factor rather than a combination of factors. In this paper, we therefore propose a machine learning-based framework that integrates multiple features for translation model pruning. Experimental results show that the framework can prune 80% of the phrase-pairs and 70% of the hierarchical rules while retaining translation quality as measured by BLEU. Our study further shows that the method selects the most useful phrase-pairs and rules, including those that are low in frequency but still very useful.
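The contrast drawn in the abstract, between a single hard rule and a decision that combines several features, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `PhrasePair` fields, the three features, the hand-set weights, and the linear scoring function are hypothetical, and the abstract does not state which features or which learner the paper actually uses.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PhrasePair:
    # All fields are hypothetical illustration features, not the paper's feature set.
    source: str
    target: str
    trans_prob: float  # translation probability of the pair
    lex_weight: float  # lexical weighting score
    count: int         # co-occurrence count in the training data


def hard_rule_prune(table, min_prob):
    """Single-constraint baseline: drop pairs below a probability cutoff."""
    return [p for p in table if p.trans_prob >= min_prob]


def score(pair, weights):
    """Linear combination of log-scaled features (weights are assumed, not learned)."""
    feats = (
        math.log(pair.trans_prob),
        math.log(pair.lex_weight),
        math.log1p(pair.count),
    )
    return sum(w * f for w, f in zip(weights, feats))


def multi_feature_prune(table, weights, keep_fraction):
    """Keep the top-scoring fraction of the table under the combined score."""
    keep = max(1, round(len(table) * keep_fraction))
    ranked = sorted(table, key=lambda p: score(p, weights), reverse=True)
    return ranked[:keep]


# Toy table with made-up values.
TABLE = [
    PhrasePair("src_a", "tgt_a", 0.90, 0.80, 50),
    PhrasePair("src_b", "tgt_b", 0.60, 0.10, 1),
    PhrasePair("src_c", "tgt_c", 0.30, 0.70, 40),
    PhrasePair("src_d", "tgt_d", 0.05, 0.02, 1),
    PhrasePair("src_e", "tgt_e", 0.55, 0.05, 2),
]
WEIGHTS = (1.0, 1.0, 0.5)  # hand-set for the example; a real system would learn these

kept_by_rule = hard_rule_prune(TABLE, min_prob=0.5)
kept_by_score = multi_feature_prune(TABLE, WEIGHTS, keep_fraction=0.4)
```

In this toy table the probability cutoff discards `src_c` even though its other features are strong, while the combined score retains it. In a learned framework the weights would be fitted from data instead of set by hand.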
Year
2015
DOI
10.1109/TASLP.2015.2456413
Venue
IEEE/ACM Trans. Audio, Speech & Language Processing
Keywords
Syntactics, Decoding, Training data, Training, Data models, Bidirectional control, IEEE transactions
DocType
Journal
Volume
23
Issue
11
ISSN
2329-9290
Citations
1
PageRank
0.39
References
29
Authors
3
Name, Order, Citations, PageRank
Mei Tu, 1, 9, 0.85
Yu Zhou, 2, 34, 6.58
Chengqing Zong, 3, 1004, 102.38