Abstract
---
Bidirectional Encoder Representations from Transformers (BERT) has shown remarkable improvements across various NLP tasks, and a series of variants has been proposed to further improve the performance of pre-trained language models. In this paper, we revisit Chinese pre-trained models to examine their effectiveness in a non-English language and release the Chinese pre-trained model series to the community. We also propose a simple but effective model called MacBERT, which improves upon RoBERTa in several ways, especially the masking strategy. We carried out extensive experiments on various Chinese NLP tasks, ranging from sentence-level to document-level, to revisit the existing pre-trained models as well as the proposed MacBERT. Experimental results show that MacBERT achieves state-of-the-art performance on many NLP tasks, and we also ablate details and report several findings that may help future research.

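The masking strategy mentioned above is MacBERT's "MLM as correction": masked positions are filled with similar words rather than the artificial `[MASK]` token, narrowing the gap between pre-training and fine-tuning. Below is a minimal Python sketch under stated assumptions, not the authors' implementation: `SIMILAR` is a hypothetical stand-in for the embedding-based similar-word lookup used in the paper, and whole-word / N-gram masking is omitted for brevity; only the 15% masking rate and the 80/10/10 (similar word / random word / keep) split are mirrored.

```python
import random

# Hypothetical similar-word table; the paper derives similar words from
# word embeddings rather than a fixed dictionary.
SIMILAR = {"quick": "fast", "brown": "dark", "jumps": "leaps"}

def mac_mask(tokens, mask_rate=0.15, seed=0):
    """Return (corrupted_tokens, labels).

    labels holds the original token at corrupted positions and None
    elsewhere, so the model is trained to "correct" those positions.
    """
    rng = random.Random(seed)
    corrupted, labels = list(tokens), [None] * len(tokens)
    n_mask = max(1, int(len(tokens) * mask_rate))
    for i in rng.sample(range(len(tokens)), n_mask):
        labels[i] = tokens[i]
        r = rng.random()
        if r < 0.8:
            # Replace with a similar word instead of [MASK].
            corrupted[i] = SIMILAR.get(tokens[i], tokens[i])
        elif r < 0.9:
            # Replace with a random word from the sequence.
            corrupted[i] = rng.choice(tokens)
        # else: keep the original token unchanged.
    return corrupted, labels

print(mac_mask("the quick brown fox jumps over the lazy dog".split()))
```
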
Year | DOI | Venue
---|---|---|
2020 | 10.18653/V1/2020.FINDINGS-EMNLP.58 | EMNLP

DocType | Volume | Citations
---|---|---|
Conference | 2020.findings-emnlp | 0

PageRank | References | Authors
---|---|---|
0.34 | 19 | 6

Name | Order | Citations | PageRank |
---|---|---|---|
Yiming Cui | 1 | 87 | 13.40 |
Wanxiang Che | 2 | 711 | 66.39 |
Ting Liu | 3 | 2735 | 232.31 |
Bing Qin | 4 | 1076 | 72.82 |
Shijin Wang | 5 | 180 | 31.56 |
Guoping Hu | 6 | 309 | 37.32 |