Title
Word sense disambiguation: an evaluation study of semi-supervised approaches with word embeddings.
Abstract
Word Sense Disambiguation (WSD) is a well-known problem in Natural Language Processing (NLP): automatically determining the most appropriate sense of a word in context. Several machine learning approaches have been proposed to tackle the ambiguity of language, but the scarcity of labeled data for training supervised models makes semi-supervised learning (SSL) an attractive option. Moreover, word embeddings have proven to be an effective way to improve the results of NLP tasks. This paper therefore adapts semi-supervised algorithms for WSD, using word embeddings from Word2Vec, FastText, and BERT models combined with part-of-speech tags as input. We conduct a systematic evaluation of four graph-based SSL models, analyzing how their hyperparameters, the distance measures used to build the graphs, the percentages of labeled data, and the architectural variations of the word embeddings influence the results. We show that SSL algorithms given only 10% of labeled data are strong baselines on the noun and adjective subsets. Additionally, these algorithms require no further training to disambiguate new words, making them competitive with supervised systems.
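To make the graph-based SSL setting described in the abstract concrete, the sketch below builds a k-NN graph over embedding vectors and propagates a small set of sense labels to the unlabeled occurrences. It is a minimal, hypothetical illustration using scikit-learn's LabelSpreading with random placeholder features; it is not the authors' exact models, embeddings, or datasets.

```python
# Minimal sketch of graph-based semi-supervised WSD, assuming (hypothetically)
# that each occurrence of an ambiguous word is represented by an embedding
# vector (e.g., Word2Vec/FastText/BERT features plus POS information).
import numpy as np
from sklearn.semi_supervised import LabelSpreading

rng = np.random.default_rng(0)

# Placeholder features: 300-dim vectors for 200 occurrences of one ambiguous word.
X = rng.normal(size=(200, 300))

# Placeholder sense labels: two senses; only 10% of instances keep their label,
# the rest are marked unlabeled (-1), mirroring the low-supervision setting.
y_true = rng.integers(0, 2, size=200)
y = np.full(200, -1)
labeled_idx = rng.choice(200, size=20, replace=False)
y[labeled_idx] = y_true[labeled_idx]

# Graph-based SSL: build a k-NN similarity graph over the embeddings and
# propagate the few known sense labels through the graph.
model = LabelSpreading(kernel="knn", n_neighbors=7, alpha=0.2)
model.fit(X, y)
predicted_senses = model.transduction_  # sense assignments for all occurrences

print("Predicted senses for the first 10 occurrences:", predicted_senses[:10])
```

In this setup, disambiguating a new occurrence only requires adding its embedding as a node in the graph and rerunning the propagation; no sense-annotated training corpus beyond the small labeled seed set is needed.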
Year
2020
DOI
10.1109/IJCNN48605.2020.9207225
Venue
IJCNN
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
3
Name                          Order  Citations  PageRank
Samuel Bruno da Silva Sousa   1      0          0.34
Evangelos Milios              2      3073       360.46
Lilian Berton                 3      16         7.82