Abstract |
---|
Large-scale pre-trained language models have contributed significantly to natural language processing by demonstrating remarkable abilities as few-shot learners. However, their effectiveness depends mainly on scaling the model parameters and on prompt design, which hinders their use in most real-world applications. This study proposes a novel pluggable, extensible, and efficient approach named DifferentiAble pRompT (DART), which can convert small language models into better few-shot learners without any prompt engineering. The main principle behind this approach is to reformulate potential natural language processing tasks into the task of a pre-trained language model and to differentially optimize the prompt template as well as the target label with backpropagation. Furthermore, the proposed approach can be: (i) plugged into any pre-trained language model; (ii) extended to widespread classification tasks. A comprehensive evaluation on standard NLP tasks demonstrates that the proposed approach achieves better few-shot performance. |
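
The mechanism sketched in the abstract, treating the prompt template tokens and the target label as free parameters optimized by backpropagation through a masked-language-model backbone, can be illustrated with a minimal sketch. It assumes a BERT-style backbone loaded through Hugging Face `transformers`; the class name `PromptedClassifier`, the number of prompt tokens, and the scoring of the `[MASK]` hidden state against learnable label embeddings are illustrative choices, not the authors' released implementation.

```python
# Minimal sketch of differentiable prompt/label tuning in the spirit of DART.
# Assumption: a BERT-style masked language model from Hugging Face transformers;
# names and hyperparameters below are illustrative, not the paper's code.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModelForMaskedLM


class PromptedClassifier(nn.Module):
    def __init__(self, backbone="bert-base-uncased", n_prompt_tokens=4, n_classes=2):
        super().__init__()
        self.tokenizer = AutoTokenizer.from_pretrained(backbone)
        self.mlm = AutoModelForMaskedLM.from_pretrained(backbone)
        hidden = self.mlm.config.hidden_size
        # Differentiable template: free embeddings tuned by backprop,
        # replacing hand-crafted prompt words.
        self.prompt = nn.Parameter(torch.randn(n_prompt_tokens, hidden) * 0.02)
        # Differentiable labels: one pseudo label embedding per class,
        # scored against the hidden state at the [MASK] position.
        self.label_emb = nn.Parameter(torch.randn(n_classes, hidden) * 0.02)

    def forward(self, sentences):
        tok = self.tokenizer(sentences, return_tensors="pt", padding=True)
        word_emb = self.mlm.get_input_embeddings()(tok["input_ids"])
        batch = word_emb.size(0)
        # Template: [sentence] [learned prompt tokens] [MASK]
        mask_ids = torch.full((batch, 1), self.tokenizer.mask_token_id)
        mask_emb = self.mlm.get_input_embeddings()(mask_ids)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        inputs = torch.cat([word_emb, prompt, mask_emb], dim=1)
        extra = torch.ones(batch, prompt.size(1) + 1, dtype=torch.long)
        attn = torch.cat([tok["attention_mask"], extra], dim=1)
        out = self.mlm(inputs_embeds=inputs, attention_mask=attn,
                       output_hidden_states=True)
        mask_hidden = out.hidden_states[-1][:, -1, :]  # state at the [MASK] slot
        return mask_hidden @ self.label_emb.t()        # class logits


# Few-shot usage: gradients flow into the prompt and label embeddings
# (and, depending on the setup, into the backbone as well).
model = PromptedClassifier()
logits = model(["a touching and funny film .", "a dull , lifeless exercise ."])
loss = nn.CrossEntropyLoss()(logits, torch.tensor([1, 0]))
loss.backward()
```

Whether the backbone is fine-tuned jointly with the prompt and label embeddings or kept frozen is a design choice; the sketch above leaves all parameters trainable.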

Year | Venue | Keywords
---|---|---
2022 | International Conference on Learning Representations (ICLR) | prompt-tuning, pre-trained language model, few-shot learning

DocType | Citations | PageRank
---|---|---
Conference | 0 | 0.34

References | Authors
---|---
0 | 8

Name | Order | Citations | PageRank
---|---|---|---
Ningyu Zhang | 1 | 63 | 18.56 |
Luoqiu Li | 2 | 0 | 1.01 |
Xiang Chen | 3 | 46 | 4.34 |
Shumin Deng | 4 | 32 | 10.61 |
Zhen Bi | 5 | 0 | 3.38 |
Chuanqi Tan | 6 | 29 | 9.25 |
Fei Huang | 7 | 2 | 7.54 |
Huanhuan Chen | 8 | 731 | 101.79 |