Abstract

It can be challenging to train multi-task neural networks that outperform or even match their single-task counterparts. To help address this, we propose using knowledge distillation where single-task models teach a multi-task model. We enhance this training with teacher annealing, a novel method that gradually transitions the model from distillation to supervised learning, helping the multi-task model surpass its single-task teachers. We evaluate our approach by multi-task fine-tuning BERT on the GLUE benchmark. Our method consistently improves over standard single-task and multi-task training.
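The teacher annealing described in the abstract interpolates between the single-task teacher's predictions and the gold labels as training progresses. The snippet below is a minimal, illustrative PyTorch-style sketch assuming a linear annealing schedule and a cross-entropy loss against the mixed soft target; the function name and schedule are assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def teacher_annealing_loss(student_logits, teacher_logits, gold_labels, step, total_steps):
    """Loss that anneals from distillation to supervised learning.

    lam rises linearly from 0 (train purely on the single-task teacher's
    predictions) to 1 (train purely on the gold labels), so the multi-task
    student can eventually move past its teachers.
    """
    lam = step / total_steps                      # assumed linear schedule
    teacher_probs = F.softmax(teacher_logits, dim=-1)
    gold_one_hot = F.one_hot(
        gold_labels, num_classes=student_logits.size(-1)
    ).float()
    # Mixed target: lam * gold labels + (1 - lam) * teacher predictions
    target = lam * gold_one_hot + (1.0 - lam) * teacher_probs
    log_probs = F.log_softmax(student_logits, dim=-1)
    # Cross-entropy of the student against the mixed soft target
    return -(target * log_probs).sum(dim=-1).mean()
```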
Year | DOI | Venue
---|---|---
2019 | 10.18653/v1/p19-1595 | ACL (1)

DocType | Volume | Citations
---|---|---
Conference | P19-1 | 2

PageRank | References | Authors
---|---|---
0.36 | 0 | 5
Name | Order | Citations | PageRank |
---|---|---|---
Kevin Clark | 1 | 102 | 5.93 |
Minh-Thang Luong | 2 | 1852 | 71.35 |
Urvashi Khandelwal | 3 | 275 | 10.28 |
Christopher D. Manning | 4 | 22579 | 1126.22 |
Quoc V. Le | 5 | 8501 | 366.59 |