Title
CPM: A Large-scale Generative Chinese Pre-trained Language Model
Abstract
Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570 GB of training data, drew a lot of attention due to its capacity for few-shot (even zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English and the parameters are not publicly available. In this technical report, we release the Chinese Pre-trained Language Model (CPM) with generative pre-training on large-scale Chinese training data. To the best of our knowledge, CPM, with 2.6 billion parameters and 100 GB of Chinese training data, is the largest Chinese pre-trained language model, which could facilitate several downstream Chinese NLP tasks, such as conversation, essay generation, cloze test, and language understanding. Extensive experiments demonstrate that CPM achieves strong performance on many NLP tasks in the settings of few-shot (even zero-shot) learning. The code and parameters are available at https://github.com/TsinghuaAI/CPM.
Year
2021
DOI
10.1016/j.aiopen.2021.07.001
Venue
AI Open
Keywords
Pre-trained language model, Zero-shot learning
DocType
Journal
Volume
2
ISSN
2666-6510
Citations
0
PageRank
0.34
References
0
Authors
25
Name            Order  Citations  PageRank
Zhengyan Zhang  1      9          1.10
Xu Han          2      15         4.94
Hao Zhou        3      0          0.68
Pei Ke          4      4          1.42
Yuxian Gu       5      0          0.34
Deming Ye       6      3          2.06
Yujia Qin       7      0          0.68
YuSheng Su      8      0          0.68
Haozhe Ji       9      0          2.03
Jian Guan       10     4          1.08
Fanchao Qi      11     12         7.27
Xiaozhi Wang    12     5          4.17
Yanan Zheng     13     0          0.34
Guoyang Zeng    14     1          1.71
Huanqi Cao      15     0          0.68
Shengqi Chen    16     1          2.04
Daixuan Li      17     0          0.34
Zhenbo Sun      18     0          0.68
Zhiyuan Liu     19     2037       123.68
Minlie Huang    20     1260       90.68
Wentao Han      21     103        8.22
Jie Tang        22     5871       300.22
Juanzi Li       23     2526       154.08
Xiaoyan Zhu     24     2125       141.16
Maosong Sun     25     2293       162.86