Title
Context-aware multi-token concept recognition of biological entities
Abstract
Background: Concept recognition comprises two sequential steps, named entity recognition and named entity normalization, and plays an essential role in bioinformatics. However, conventional dictionary-based methods do not sufficiently address the variation of concepts as they are actually used in the literature, which particularly degrades recognition performance for multi-token concepts.
Results: In this paper, we propose a method for concept recognition of multi-token biological entities using neural models combined with literature contexts. The key aspect of our method is the use of contextual information from biological knowledge bases for concept normalization, which is followed by a named entity recognition procedure. The model showed improved performance over conventional methods, particularly for multi-token concepts with higher variation.
Conclusions: We expect that our model can be used for effective concept recognition and a variety of natural language processing tasks in bioinformatics.
Year: 2021
DOI: 10.1186/s12859-021-04248-8
Venue: BMC BIOINFORMATICS
Keywords: BERT, Concept recognition, Entity normalization, Gene ontology
DocType: Journal
Volume: 22
Issue: SUPPL 11
ISSN: 1471-2105
Citations: 0
PageRank: 0.34
References: 0
Authors: 2
Name: Kwangmin Kim; Order: 1; Citations: 7; PageRank: 4.28
Name: Doheon Lee; Order: 2; Citations: 0; PageRank: 2.03