Title
Distilling Knowledge For Search-Based Structured Prediction
Abstract
Many natural language processing tasks can be modeled as structured prediction and solved as a search problem. In this paper, we distill an ensemble of multiple models trained with different initializations into a single model. In addition to learning to match the ensemble's probability output on the reference states, we also use the ensemble to explore the search space and learn from the states encountered during that exploration. Experimental results on two typical search-based structured prediction tasks, transition-based dependency parsing and neural machine translation, show that distillation can effectively improve the single model's performance: the final model achieves improvements of 1.32 LAS and 2.65 BLEU over strong baselines on the two tasks respectively, and it outperforms the greedy structured prediction models in the previous literature.
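The distillation objective sketched in the abstract, training a single student to match the ensemble's averaged action distribution on both reference states and states reached while the ensemble explores, can be illustrated as follows. This is a minimal sketch, not the authors' implementation; the model interfaces, the function names, and the way states are collected are all assumptions for illustration.

```python
# Minimal sketch of ensemble knowledge distillation for search-based
# structured prediction (assumed interfaces, not the paper's code).
import torch
import torch.nn.functional as F

def ensemble_distribution(teachers, state):
    """Average the teachers' action distributions q(a | s) for one search state."""
    probs = torch.stack([F.softmax(t(state), dim=-1) for t in teachers], dim=0)
    return probs.mean(dim=0)

def distill_loss(student, teachers, states):
    """Cross-entropy between the ensemble's soft targets and the student's p(a | s)."""
    loss = torch.tensor(0.0)
    for state in states:
        with torch.no_grad():
            target = ensemble_distribution(teachers, state)   # soft target from the ensemble
        log_p = F.log_softmax(student(state), dim=-1)         # student log-probabilities
        loss = loss - (target * log_p).sum()                  # CE(target, student)
    return loss / len(states)

# Training would mix two sources of states, e.g.:
#   states = reference_states(batch) + explored_states(teachers, batch)
# where reference_states follows the gold action sequence and explored_states
# comes from letting the ensemble search (sample or beam) on the same inputs;
# both helpers are hypothetical placeholders.
```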
Year
2018
DOI
10.18653/v1/p18-1129
Venue
PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1
DocType
Conference
Volume
abs/1805.11224
Citations
1
PageRank
0.38
References
26
Authors
5
Name            Order  Citations  PageRank
Yijia Liu       1      49         7.34
Wanxiang Che    2      711        66.39
Huaipeng Zhao   3      1          0.71
Bing Qin        4      1076       72.82
Ting Liu        5      2735       232.31