Title |
---|
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning |
Abstract |
---|
Recent pretrained language models have scaled from millions to billions of parameters, so the need to fine-tune an extremely large pretrained model with a limited training corpus arises in various downstream tasks. In this paper, we propose a straightforward yet effective fine-tuning technique, Child-Tuning, which updates only a subset of the parameters (called the child network) of a large pretrained model by strategically masking out the gradients of the non-child network during the backward pass. Experiments on various downstream tasks in the GLUE benchmark show that Child-Tuning consistently outperforms vanilla fine-tuning by 1.5~8.6 points in average score across four different pretrained models, and surpasses prior fine-tuning techniques by 0.6~1.3 points. Furthermore, empirical results on domain transfer and task transfer show that Child-Tuning achieves better generalization by large margins. |
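The core idea in the abstract, masking out gradients of the non-child network so that only the child network is updated, can be sketched in a few lines. This is a minimal illustrative sketch in plain Python, assuming a flat parameter list, a Bernoulli-sampled child mask, and a plain SGD update; the function names, mask probability, and optimizer choice are assumptions for illustration, not the paper's exact setup.

```python
import random


def sample_child_mask(n, p_child=0.3, seed=0):
    """Draw a Bernoulli(p_child) mask over n parameters, so roughly
    a p_child fraction of them form the child network (mask == 1).
    (Task-free variant; p_child is an illustrative choice.)"""
    rng = random.Random(seed)
    return [1 if rng.random() < p_child else 0 for _ in range(n)]


def child_tuning_step(params, grads, mask, lr=0.1):
    """One Child-Tuning update: gradients of non-child parameters
    (mask == 0) are zeroed before the update, so only the child
    network moves. Plain SGD is assumed here for simplicity."""
    return [p - lr * g * m for p, g, m in zip(params, grads, mask)]
```

For example, with `mask = [1, 0, 1]` the middle parameter keeps its pretrained value regardless of its gradient, which is exactly the effect of masking the backward pass for the non-child network.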
Year | Venue | DocType |
---|---|---|
2021 | EMNLP | Conference |
Volume | Citations | PageRank
---|---|---|
2021.emnlp-main | 0 | 0.34 |
References | Authors |
---|---|
0 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xu Runxin | 1 | 0 | 1.35 |
Fuli Luo | 2 | 10 | 6.20 |
Zhiyuan Zhang | 3 | 0 | 0.34 |
Chuanqi Tan | 4 | 0 | 0.34 |
Baobao Chang | 5 | 0 | 0.68 |
Songfang Huang | 6 | 0 | 0.68 |
Fei Huang | 7 | 2 | 7.54 |