Title
BERTweet: A pre-trained language model for English Tweets
Abstract
We present BERTweet, the first public large-scale pre-trained language model for English Tweets. BERTweet is trained using the RoBERTa pre-training procedure (Liu et al., 2019), with the same model configuration as BERT-base (Devlin et al., 2019). Experiments show that BERTweet outperforms the strong baselines RoBERTa-base and XLM-R-base (Conneau et al., 2020) and achieves better performance than the previous state-of-the-art models on three Tweet NLP tasks: part-of-speech tagging, named-entity recognition and text classification. We release BERTweet to facilitate future research and downstream applications on Tweet data. BERTweet is available at: https://github.com/VinAIResearch/BERTweet
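The released checkpoint can be loaded directly for feature extraction. The sketch below assumes the model is published on the Hugging Face model hub under the ID "vinai/bertweet-base" (taken from the linked repository); the example Tweet is purely illustrative.

    import torch
    from transformers import AutoModel, AutoTokenizer

    # Load the pre-trained BERTweet checkpoint and its tokenizer.
    # Model ID "vinai/bertweet-base" is an assumption based on the linked repository.
    bertweet = AutoModel.from_pretrained("vinai/bertweet-base")
    tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")

    # Encode an illustrative Tweet and extract contextual embeddings.
    line = "SC has first two presumptive cases of coronavirus , DHEC confirms"
    input_ids = torch.tensor([tokenizer.encode(line)])
    with torch.no_grad():
        features = bertweet(input_ids)  # last_hidden_state: one vector per token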
Year
2020
DOI
10.18653/V1/2020.EMNLP-DEMOS.2
Venue
EMNLP
DocType
Conference
Volume
2020.emnlp-demos
Citations
0
PageRank
0.34
References
0
Authors
3
Name               Order   Citations   PageRank
Dat Quoc Nguyen    1       246         25.87
Thanh Vu           2       40          6.87
Tuan Nguyen Anh    3       1           5.09