SINAI@SMM4H'22: Transformers for biomedical social media text mining in Spanish. - Citegraph

Paper Info

Title
SINAI@SMM4H'22: Transformers for biomedical social media text mining in Spanish.

Abstract
This paper covers the participation of the SINAI team in Tasks 5 and 10 of the Social Media Mining for Health (#SMM4H) workshop at COLING-2022. These tasks focus on leveraging Twitter posts written in Spanish for healthcare research. The objective of Task 5 was to classify tweets reporting COVID-19 symptoms, while Task 10 required identifying disease mentions in Twitter posts. The presented systems explore large RoBERTa language models pre-trained on Twitter data for the tweet classification task and on general-domain data for the disease recognition task. We also present a text pre-processing methodology implemented in both systems and describe an initial weakly-supervised fine-tuning phase along with a submission post-processing procedure designed for Task 10. The systems obtained an F1-score of 0.84 on Task 5 and 0.77 on Task 10.

Year	Venue	DocType
2022	International Conference on Computational Linguistics	Conference
Citations	PageRank	References	Authors
0	0.34	0	5

Authors (5 rows)

Name	Order	Citations	PageRank
Mariia Chizhikova	1	0	0.34
Pilar López-Úbeda	2	0	6.76
Manuel Carlos Díaz-Galiano	3	35	21.69
Luis Alfonso Ureña López	4	257	53.93
Maite Martín-Valdivia	5	25	6.80
