Improving condition severity classification with an efficient active learning based framework. - Citegraph

Paper Info

Title
Improving condition severity classification with an efficient active learning based framework.

Abstract
Classification of condition severity can be useful for discriminating among sets of conditions or phenotypes, for example when prioritizing patient care or for other healthcare purposes. Electronic Health Records (EHRs) represent a rich source of labeled information that can be harnessed for severity classification. However, the labeling of EHRs is expensive and in many cases requires employing professionals with a high level of expertise. In this study, we demonstrate the use of Active Learning (AL) techniques to decrease expert labeling efforts. We employ three AL methods and demonstrate their ability to reduce labeling efforts while effectively discriminating condition severity. We incorporate these three AL methods into a new framework based on the original CAESAR (Classification Approach for Extracting Severity Automatically from Electronic Health Records) framework to create the Active Learning Enhancement framework (CAESAR-ALE). We applied CAESAR-ALE to a dataset containing 516 conditions of varying severity levels that were manually labeled by seven experts. Our dataset, called the "CAESAR dataset," was created from the medical records of 1.9 million patients treated at Columbia University Medical Center (CUMC). All three AL methods decreased labelers' efforts compared to the learning methods applied by the original CAESAR framework, in which the classifier was trained on the entire set of conditions. Depending on the AL strategy used in the current study, the reduction ranged from 48% to 64%, which can result in significant savings in both time and money. On the PPV (precision) measure, CAESAR-ALE achieved an absolute improvement of more than 13% in the predictive capabilities of the framework when classifying conditions as severe. These results demonstrate the potential of AL methods to decrease the labeling efforts of medical experts while increasing accuracy given the same (or even a smaller) number of acquired conditions. We also demonstrated that the methods included in the CAESAR-ALE framework (Exploitation and Combination_XA) are more robust to the use of human labelers with different levels of professional expertise.

Year	DOI	Venue
2016	10.1016/j.jbi.2016.03.016	Journal of Biomedical Informatics
Keywords	Field	DocType
Active Learning, Condition, Electronic Health Records, Phenotyping, Severity	Data mining, Active learning, Computer science, Support vector machine, Automation, Data curation, Artificial intelligence, SNOMED CT, Classifier (linguistics), Problem-based learning, Machine learning, Test set	Journal
Volume	Issue	ISSN
61	C	1532-0464
Citations	PageRank	References
5	0.42	30
Authors
7

Authors (7 rows)

Name	Order	Citations	PageRank
Nir Nissim	1	199	19.42
Mary Regina Boland	2	100	8.63
Nicholas P. Tatonetti	3	98	7.37
Yuval Elovici	4	2583	204.53
George Hripcsak	5	1493	160.86
Yuval Shahar	6	1974	214.22
Robert Moskovitch	7	729	39.62
