Abstract |
---|
Pre-trained language models have brought significant performance improvements across a variety of natural language processing tasks. Most existing models that achieve state-of-the-art results present their approaches from the separate perspectives of data processing, pre-training tasks, neural network modeling, or fine-tuning. In this paper, we demonstrate how these approaches affect performance individually, and show that a language model achieves the best results on a specific question answering task when those approaches are jointly considered during pre-training. In particular, we propose an extended pre-training task and a new neighbor-aware mechanism that attends more to neighboring tokens in order to capture the richness of context for pre-training language modeling. Our best model achieves new state-of-the-art results of 95.7% F1 and 90.6% EM on SQuAD 1.1 and also outperforms existing pre-trained language models such as RoBERTa, ALBERT, ELECTRA, and XLNet on the SQuAD 2.0 benchmark. |
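The abstract does not spell out how the neighbor-aware mechanism is realized, but one plausible reading is an additive locality bias on self-attention logits, so that tokens close to the query position receive higher attention weight. The sketch below is an illustrative assumption, not the paper's published implementation; the names `neighbor_aware_attention`, `alpha`, and `window` are hypothetical.

```python
# A minimal sketch (assumed, not the paper's code) of a "neighbor-aware"
# self-attention: attention logits get an additive bonus that decays with
# token distance, so neighboring tokens are attended to more.
import numpy as np

def neighbor_aware_attention(q, k, v, alpha=1.0, window=3):
    """Scaled dot-product attention with an additive locality bias.

    q, k, v: (seq_len, d) arrays.
    alpha:   strength of the neighbor bonus (assumed hyperparameter).
    window:  tokens within this distance receive the full bonus.
    """
    seq_len, d = q.shape
    scores = q @ k.T / np.sqrt(d)  # standard attention logits

    # |i - j| distance matrix between query and key positions.
    idx = np.arange(seq_len)
    dist = np.abs(idx[:, None] - idx[None, :])

    # Full bonus alpha inside the window, linear decay to zero outside it;
    # one possible realization of "attending neighboring tokens more".
    bias = alpha * np.clip(1.0 - (dist - window) / window, 0.0, 1.0)
    scores = scores + bias

    # Softmax over keys (numerically stabilized).
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Tiny usage example with random token embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
out = neighbor_aware_attention(x, x, x)
print(out.shape)  # (8, 16)
```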
Year | DOI | Venue |
---|---|---|
2022 | 10.18653/v1/2022.repl4nlp-1.13 | Proceedings of the 7th Workshop on Representation Learning for NLP
DocType | Volume | ISSN
---|---|---|
Conference | Proceedings of the 7th Workshop on Representation Learning for NLP | ACL 2022 Workshop RepL4NLP Submission
Citations | PageRank | References
---|---|---|
0 | 0.34 | 0
Authors |
---|
7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Changwook Jun | 1 | 0 | 0.34 |
Hansol Jang | 2 | 0 | 0.34 |
Myoseop Sim | 3 | 0 | 0.34 |
Hyun Kim | 4 | 0 | 0.34 |
Jooyoung Choi | 5 | 0 | 0.34 |
Kyungkoo Min | 6 | 0 | 0.34 |
Kyunghoon Bae | 7 | 0 | 0.34 |