Paper Info

Title
Improving Medical Code Prediction from Clinical Text via Incorporating Online Knowledge Sources

Abstract
Clinical notes contain detailed information about the health status of patients for each of their encounters with a health system. Developing effective models to automatically assign medical codes to clinical notes has been a long-standing active research area. Despite great recent progress in medical informatics fueled by deep learning, it is still a challenge to find the specific piece of evidence in a clinical note which justifies a particular medical code out of all possible codes. Considering the large number of online disease knowledge sources, which contain detailed information about signs and symptoms of different diseases, their risk factors, and epidemiology, there is an opportunity to exploit such sources. In this paper we consider Wikipedia as an external knowledge source and propose Knowledge Source Integration (KSI), a novel end-to-end code assignment framework, which can integrate external knowledge during training of any baseline deep learning model. The main idea of KSI is to calculate matching scores between a clinical note and disease-related Wikipedia documents, and combine the scores with the output of the baseline model. To evaluate KSI, we experimented with automatic assignment of ICD-9 diagnosis codes to the emergency department clinical notes from the MIMIC-III data set, aided by Wikipedia documents corresponding to the ICD-9 codes. We evaluated several baseline models, ranging from logistic regression to recently proposed deep learning models known to achieve state-of-the-art accuracy on clinical notes. The results show that KSI consistently improves the baseline models and that it is particularly successful in the assignment of rare codes. In addition, by analyzing the weights of KSI models, we can gain an understanding of which words in Wikipedia documents provide useful information for predictions.
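
The core idea in the abstract, scoring how well a clinical note matches each code's knowledge document and combining that score with a baseline model's output, can be illustrated with a small sketch. This is not the architecture from the paper: the binary bag-of-words representation, the per-word weight vector, the additive combination with baseline logits, and all function names (bag_of_words, matching_score, combine) are assumptions made only for illustration.

```python
import math

def bag_of_words(text, vocab):
    """Binary bag-of-words vector over a fixed vocabulary (assumed representation)."""
    tokens = set(text.lower().split())
    return [1.0 if word in tokens else 0.0 for word in vocab]

def matching_score(note_vec, doc_vec, weights):
    """Weighted count of vocabulary words shared by the note and one knowledge document."""
    return sum(w * n * d for w, n, d in zip(weights, note_vec, doc_vec))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def combine(baseline_logits, note_vec, doc_vecs, weights):
    """Add each code's matching score to its baseline logit, then apply a sigmoid.

    Additive combination is an assumption; the abstract only says the scores
    are combined with the baseline output.
    """
    return [
        sigmoid(logit + matching_score(note_vec, doc_vec, weights))
        for logit, doc_vec in zip(baseline_logits, doc_vecs)
    ]

# Toy data, entirely hypothetical.
vocab = ["fever", "cough", "fracture", "pain"]
note_vec = bag_of_words("patient presents with fever and cough", vocab)
doc_vecs = [
    bag_of_words("fever cough", vocab),    # stand-in document for a pneumonia code
    bag_of_words("fracture pain", vocab),  # stand-in document for a fracture code
]
weights = [1.0] * len(vocab)    # would be learned during training; fixed here
baseline_logits = [0.0, 0.0]    # an uninformative baseline for both codes

probs = combine(baseline_logits, note_vec, doc_vecs, weights)
```

In this toy setup the baseline is indifferent between the two codes, so any difference in the final probabilities comes from word overlap with the knowledge documents. Inspecting the learned values in weights is the kind of analysis the abstract alludes to when it mentions understanding which words provide useful information.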

Year	DOI	Venue
2019	10.1145/3308558.3313485	WWW '19: The Web Conference on The World Wide Web Conference WWW 2019
Keywords	Field	DocType
Multi-label classification, attention mechanism, document similarity learning, healthcare	Health care, Medical classification, Diagnosis code, Information retrieval, Computer science, Code assignment, Exploit, Multi-label classification, Artificial intelligence, Deep learning, Health informatics, Machine learning	Conference
ISBN	Citations	PageRank
978-1-4503-6674-8	1	0.41
References	Authors
0	2

Authors (2 rows)

Name	Order	Citations	PageRank
Tian Bai	1	16	3.40
Slobodan Vucetic	2	637	56.38
