Title
NECOS: An Annotated Corpus to Identify Constructive News Comments in Spanish
Abstract
In this paper, we present the NEws and COmments in Spanish (NECOS) corpus, a collection of Spanish comments posted in response to newspaper articles. The articles were published in the newspaper El Mundo between April 3rd and April 30th, 2018. The corpus comprises 10 news articles and 1,419 comments. Following a robust annotation scheme, three annotators manually labeled each comment as constructive or non-constructive, reaching an average Cohen's kappa of 78.97. Our current focus is the study of constructiveness and the evaluation of the NECOS corpus. To this end, we propose a benchmark that tests different machine learning systems based on Natural Language Processing: a traditional system and the novel Transformer-based models. Specifically, we compare multilingual models with a monolingual model trained on Spanish in order to highlight the need to create language-specific resources. The monolingual model fine-tuned on NECOS obtains the best result, achieving a macro-averaged F1 score of 77.24%.
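The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes that the average Cohen's kappa is the mean over the three annotator pairs and that the reported 78.97 is on a 0-100 scale (i.e. 0.7897); neither detail is stated in the abstract. The function names and the toy labels below are hypothetical and are not taken from NECOS.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa between two annotators' label sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[l] * count_b[l] for l in set(a) | set(b)) / (n * n)
    if expected == 1.0:  # both annotators used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


def average_pairwise_kappa(annotations):
    """Mean Cohen's kappa over all annotator pairs (assumed averaging scheme)."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels: 1 = constructive, 0 = non-constructive (hypothetical data).
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 1, 0, 0]
annotator_3 = [1, 0, 0, 0]
print(average_pairwise_kappa([annotator_1, annotator_2, annotator_3]))
print(macro_f1(annotator_1, annotator_3))
```

Macro averaging weights both classes equally, so the score is not dominated by whichever class is more frequent in the corpus.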
Year
2021
DOI
10.26342/2021-66-3
Venue
PROCESAMIENTO DEL LENGUAJE NATURAL
Keywords
Corpora, constructiveness, Natural Language Processing, Transformer-based models
DocType
Journal
Volume
66
Issue
66
ISSN
1135-5948
Citations
0
PageRank
0.34
References
0
Authors
4