Title
Cross-lingual Candidate Search for Biomedical Concept Normalization.
Abstract
Biomedical concept normalization links concept mentions in text to a semantically equivalent concept in a biomedical knowledge base. This task is challenging because a concept can be expressed in many different ways in natural language, e.g., through paraphrases, and not all of these expressions are present in the knowledge base. Concept normalization of non-English biomedical text is even more challenging, as non-English resources tend to be much smaller and contain fewer synonyms. To overcome the limitations of non-English terminologies, we propose a cross-lingual candidate search for concept normalization using a character-based neural translation model trained on a multilingual biomedical terminology. Our model is trained on the Spanish, French, Dutch and German versions of UMLS. We evaluate the model on the French Quaero corpus, where it outperforms most teams of CLEF eHealth 2015 and 2016. Additionally, we compare its performance to commercial translators on the Spanish, French, Dutch and German versions of Mantra. Our model performs similarly well, but it is free of charge and can be run locally. This is particularly important for clinical NLP applications, as medical documents are subject to strict privacy restrictions.
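The candidate search described in the abstract can be illustrated with a minimal sketch: a non-English mention is first translated into English, and the translation is then matched against an English terminology to produce ranked concept candidates. Everything below is an assumption made for illustration. The lookup table `TOY_TRANSLATIONS` stands in for the character-based neural translation model, the tiny `ENGLISH_TERMS` dictionary stands in for the English part of UMLS, the concept IDs are placeholders rather than real UMLS identifiers, and string similarity from Python's `difflib` is a substitute for whatever matching the authors actually use.

```python
from difflib import SequenceMatcher

# Toy English terminology: term -> concept ID.
# The IDs are placeholders for illustration, not real UMLS identifiers.
ENGLISH_TERMS = {
    "heart attack": "C-0001",
    "myocardial infarction": "C-0001",
    "headache": "C-0002",
    "kidney failure": "C-0003",
}

# Stand-in for the character-based neural translation model (assumption):
# a fixed French -> English lookup table.
TOY_TRANSLATIONS = {
    "crise cardiaque": "heart attack",
    "mal de tête": "headache",
    "insuffisance rénale": "kidney failure",
}


def translate(mention):
    """Translate a mention into English; fall back to the lowercased input."""
    key = mention.lower()
    return TOY_TRANSLATIONS.get(key, key)


def candidate_search(mention, top_k=3):
    """Return up to top_k (concept_id, matched_term, score) candidates."""
    english = translate(mention)
    # Score every English term by character-level similarity to the translation.
    scored = [
        (SequenceMatcher(None, english, term).ratio(), term, concept_id)
        for term, concept_id in ENGLISH_TERMS.items()
    ]
    scored.sort(key=lambda item: item[0], reverse=True)

    # Keep only the best-scoring term per concept.
    candidates, seen = [], set()
    for score, term, concept_id in scored:
        if concept_id in seen:
            continue
        seen.add(concept_id)
        candidates.append((concept_id, term, score))
        if len(candidates) == top_k:
            break
    return candidates


if __name__ == "__main__":
    for concept_id, term, score in candidate_search("crise cardiaque"):
        print(concept_id, term, round(score, 2))
```

The point of the sketch is the design choice rather than the components: because the search runs against the larger English terminology, a non-English mention can be linked even when its own language's terminology lacks the matching synonym.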
Year
2018
Venue
arXiv: Computation and Language
Field
Normalization (statistics), Terminology, Computer science, Semantic equivalence, Natural language, Artificial intelligence, Natural language processing, Knowledge base, Unified Medical Language System, Clef, German
DocType
Journal
Volume
abs/1805.01646
Citations
0
PageRank
0.34
References
3
Authors
4
Name | Order | Citations | PageRank
Roland Roller | 1 | 15 | 6.35
Madeleine Kittner | 2 | 0 | 1.69
Dirk Weissenborn | 3 | 141 | 11.62
Ulf Leser | 4 | 2071 | 174.23