Title
Fast medical concept normalization for biomedical literature based on stack and index optimized self-attention
Abstract
Medical concept normalization aims to construct a semantic mapping between mentions and concepts and to uniformly represent mentions that belong to the same concept. In large-scale biomedical literature databases, a fast concept normalization method is essential for processing large numbers of requests and documents. To this end, we propose a hierarchical concept normalization method, named FastMCN, with a much lower computational cost, and a variant of the transformer encoder, named stack and index optimized self-attention (SISA), to improve efficiency and performance. During training, FastMCN uses SISA as a word encoder to encode word representations from character sequences, and a mention encoder that summarizes the word representations to represent a mention. During inference, FastMCN indexes and summarizes word representations to represent a query mention and outputs the concept whose representation has the maximum cosine similarity with it. To further improve performance, SISA was pre-trained using the continuous bag-of-words architecture on 18.6 million PubMed abstracts. All experiments were evaluated on two publicly available datasets: NCBI disease and BC5CDR disease. The results showed that SISA was three times faster than the transformer encoder at encoding word representations while performing better. Benefiting from SISA, FastMCN was efficient in both training and inference, i.e., it reached the peak performance of most baseline methods within 30 s and was 3000–5600 times faster than the state-of-the-art method in inference.
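The inference step described in the abstract amounts to a nearest-neighbor lookup in concept embedding space. Below is a minimal sketch of that step, not the authors' released code: it assumes pre-computed, L2-normalized concept embeddings and mean pooling as the mention summarizer, and the names encode_mention, normalize_mention, and concept_matrix are hypothetical.

import numpy as np

def encode_mention(word_vectors):
    # Summarize word representations into a single mention vector
    # (mean pooling here; FastMCN's mention encoder is more elaborate)
    # and L2-normalize it so dot products equal cosine similarities.
    v = np.asarray(word_vectors).mean(axis=0)
    return v / np.linalg.norm(v)

def normalize_mention(word_vectors, concept_matrix, concept_ids):
    # concept_matrix: (n_concepts, d) array of L2-normalized concept
    # embeddings. With unit vectors, one matrix-vector product yields
    # all cosine similarities; the predicted concept is the argmax row.
    query = encode_mention(word_vectors)
    scores = concept_matrix @ query
    return concept_ids[int(np.argmax(scores))]

Because the concept embeddings can be indexed once ahead of time, each query costs a single matrix-vector product plus an argmax, which is consistent with the large inference speedups the abstract reports.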
Year
2022
DOI
10.1007/s00521-022-07228-y
Venue
Neural Computing and Applications
Keywords
Medical concept normalization, Transformer, Distributed word embedding
DocType
Journal
Volume
34
Issue
19
ISSN
0941-0643
Citations
0
PageRank
0.34
References
14
Authors
8
Name | Order | Citations | PageRank
Liang Likeng | 1 | 0 | 0.34
Tianyong Hao | 2 | 54 | 13.89
Zhan Choujun | 3 | 0 | 0.34
Qiu Hong | 4 | 0 | 0.34
Fu Lee Wang | 5 | 926 | 118.55
Jun Yan | 6 | 1798 | 85.25
Weng Heng | 7 | 0 | 0.34
Qu Yingying | 8 | 0 | 0.34