On the impact of knowledge-based linguistic annotations in the quality of scientific embeddings - Citegraph

Paper Info

Title
On the impact of knowledge-based linguistic annotations in the quality of scientific embeddings

Abstract
In essence, embedding algorithms work by optimizing the distance between a word and its usual context in order to generate an embedding space that encodes the distributional representation of words. In addition to single words or word pieces, other features which result from the linguistic analysis of text, including lexical, grammatical and semantic information, can be used to improve the quality of embedding spaces. However, until now we did not have a precise understanding of the impact that such individual annotations and their possible combinations may have on the quality of the embeddings. In this paper, we conduct a comprehensive study on the use of explicit linguistic annotations to generate embeddings from a scientific corpus and quantify their impact on the resulting representations. Our results show how the effect of such annotations on the embeddings varies depending on the evaluation task. In general, we observe that learning embeddings using linguistic annotations contributes to achieving better evaluation results.

Year	DOI	Venue
2021	10.1016/j.future.2021.02.019	Future Generation Computer Systems
Keywords	DocType	Volume
Natural language processing,Linguistic analysis,Knowledge graphs,Embeddings	Journal	120
ISSN	Citations	PageRank
0167-739X	1	0.35
References	Authors
0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Andres Garcia-Silva	1	1	0.35
Ronald Denaux	2	153	14.39
José Manuel Gómez-Pérez	3	9	6.77
