Title
Clinical Phrase Mining with Language Models.
Abstract
A vast amount of vital clinical data is available within unstructured texts such as discharge summaries and procedure notes in Electronic Medical Records (EMRs). Automatically transforming such unstructured data into structured units is crucial for effective data analysis in the field of clinical informatics. Recognizing phrases that reveal important medical information in a concise and thorough manner is a fundamental step in this process. Existing systems that are built for open-domain texts are designed to detect mostly non-medical phrases, while tools designed specifically for extracting concepts from clinical texts are not scalable to large corpora and often leave out essential context surrounding those detected clinical concepts. We address these issues by proposing a framework, CliniPhrase, which adapts domain-specific deep neural network based language models (such as ClinicalBERT) to effectively and efficiently extract high-quality phrases from clinical documents with a limited amount of training data. Experimental results on the MIMIC-III dataset show that our method can outperform the current state-of-the-art techniques by up to 18% in terms of F1 measure while being very efficient (up to 48 times faster). Our source code, pre-trained models, and documentation are available online at: https://github.com/kaushikmani/PhraseMiningLM
Year: 2020
DOI: 10.1109/BIBM49941.2020.9313496
Venue: BIBM
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors: 6
Name | Order | Citations | PageRank
Kaushik Mani | 1 | 0 | 0.34
Xiang Yue | 2 | 3 | 4.78
Bernal Jimenez Gutierrez | 3 | 0 | 0.34
Yungui Huang | 4 | 9 | 4.21
Simon M. Lin | 5 | 366 | 37.72
Huan Sun | 6 | 333 | 34.97