Exploring diverse features for statistical machine translation model pruning - Citegraph

Paper Info

Title
Exploring diverse features for statistical machine translation model pruning

Abstract
In phrase-based and hierarchical phrase-based statistical machine translation systems, translation performance depends heavily on the size and quality of the translation table. To meet real-time response requirements, some research has investigated filtering the translation table. However, most existing methods rely on one or two constraints that act as hard rules, such as disallowing phrase-pairs with low translation probabilities. Such constraints can be too rigid because they consider only a single factor rather than a combination of factors. In this paper, we therefore propose a machine learning-based framework that integrates multiple features for translation model pruning. Experimental results show that our framework is effective: it prunes 80% of the phrase-pairs and 70% of the hierarchical rules while retaining translation quality as measured by BLEU. Our study further shows that our method can select the most useful phrase-pairs and rules, including those that are low in frequency but still very useful.
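
The contrast the abstract draws, between a single hard rule and a decision that combines several features, can be illustrated with a small sketch. This is not the paper's method: the feature set, the `used_in_decoding` flag, the hand-set weights, and the toy phrase table are all hypothetical, and a fixed linear scorer stands in for whatever learned model the authors actually use.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PhrasePair:
    source: str
    target: str
    trans_prob: float        # p(target | source); illustrative value
    count: int               # co-occurrence count; illustrative value
    used_in_decoding: bool   # hypothetical feature, not taken from the paper


def prune_by_threshold(table, min_prob):
    """Hard rule: keep only pairs whose translation probability is >= min_prob."""
    return [p for p in table if p.trans_prob >= min_prob]


def score(pair, weights):
    """Linear combination of features (a stand-in for a learned classifier)."""
    features = {
        "trans_prob": pair.trans_prob,
        "log_count": math.log1p(pair.count),
        "used_in_decoding": 1.0 if pair.used_in_decoding else 0.0,
    }
    return sum(weights[name] * value for name, value in features.items())


def prune_by_score(table, weights, keep_fraction):
    """Rank pairs by combined score and keep the top keep_fraction of them."""
    keep = max(1, round(len(table) * keep_fraction))
    ranked = sorted(table, key=lambda p: score(p, weights), reverse=True)
    return ranked[:keep]


# Toy table; every number here is made up for illustration.
TABLE = [
    PhrasePair("la maison", "the house", 0.60, 120, True),
    PhrasePair("la maison", "house", 0.25, 50, False),
    PhrasePair("la maison", "the home", 0.05, 2, True),   # rare but useful
    PhrasePair("la maison", "the", 0.08, 40, False),      # noisy alignment
    PhrasePair("la maison", "home the", 0.02, 1, False),
]

# Hand-set weights; a real system would learn these from data.
WEIGHTS = {"trans_prob": 1.0, "log_count": 0.1, "used_in_decoding": 1.0}

kept_hard = [p.target for p in prune_by_threshold(TABLE, min_prob=0.1)]
kept_scored = [p.target for p in prune_by_score(TABLE, WEIGHTS, keep_fraction=0.4)]
```

Under these assumed weights, the probability threshold discards the rare pair "the home", while the combined score keeps it because another feature compensates for its low probability and frequency.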

Year	2015
DOI	10.1109/TASLP.2015.2456413
Venue	IEEE/ACM Trans. Audio, Speech & Language Processing
Keywords	Syntactics, Decoding, Training data, Training, Data models, Bidirectional control, IEEE transactions
DocType	Journal
Volume	23
Issue	11
ISSN	2329-9290
Citations	1
PageRank	0.39
References	29
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Mei Tu	1	9	0.85
Yu Zhou	2	34	6.58
Chengqing Zong	3	1004	102.38
