Title |
---|
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning |
Abstract |
---|
Recent pretrained language models have scaled from millions to billions of parameters, so the need to fine-tune an extremely large pretrained model with a limited training corpus arises in various downstream tasks. In this paper, we propose a straightforward yet effective fine-tuning technique, Child-Tuning, which updates only a subset of the parameters (called the child network) of a large pretrained model by strategically masking out the gradients of the non-child network during the backward pass. Experiments on various downstream tasks in the GLUE benchmark show that Child-Tuning consistently outperforms vanilla fine-tuning by 1.5~8.6 points in average score across four different pretrained models, and surpasses prior fine-tuning techniques by 0.6~1.3 points. Furthermore, empirical results on domain transfer and task transfer show that Child-Tuning achieves better generalization by large margins. |
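The core idea in the abstract, masking out gradients of the non-child network so that only the child network is updated, can be sketched in a few lines. This is a minimal illustrative sketch in plain Python, assuming a flat parameter list, a Bernoulli-sampled child mask, and a plain SGD update; the function names, mask probability, and optimizer choice are assumptions for illustration, not the paper's exact setup.

```python
import random


def sample_child_mask(n, p_child=0.3, seed=0):
    """Draw a Bernoulli(p_child) mask over n parameters, so roughly
    a p_child fraction of them form the child network (mask == 1).
    (Task-free variant; p_child is an illustrative choice.)"""
    rng = random.Random(seed)
    return [1 if rng.random() < p_child else 0 for _ in range(n)]


def child_tuning_step(params, grads, mask, lr=0.1):
    """One Child-Tuning update: gradients of non-child parameters
    (mask == 0) are zeroed before the update, so only the child
    network moves. Plain SGD is assumed here for simplicity."""
    return [p - lr * g * m for p, g, m in zip(params, grads, mask)]
```

For example, with `mask = [1, 0, 1]` the middle parameter keeps its pretrained value regardless of its gradient, which is exactly the effect of masking the backward pass for the non-child network.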
Year | Venue | DocType |
---|---|---|
2021 | EMNLP | Conference |
Volume | Citations | PageRank
---|---|---|
2021.emnlp-main | 0 | 0.34 |
References | Authors |
---|---|
0 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xu Runxin | 1 | 0 | 1.35 |
Fuli Luo | 2 | 10 | 6.20 |
Zhiyuan Zhang | 3 | 0 | 0.34 |
Chuanqi Tan | 4 | 0 | 0.34 |
Baobao Chang | 5 | 0 | 0.68 |
Songfang Huang | 6 | 0 | 0.68 |
Fei Huang | 7 | 2 | 7.54 |