Title
Inductive Identification of Functional Status Information and Establishing a Gold Standard Corpus: A Case Study on the Mobility Domain
Abstract
The importance of functional status information (FSI) has become increasingly evident in recent years [1, 2]. However, the implementation, application, and normalization of FSI in health care and Electronic Health Records (EHRs) remain largely underexplored. The World Health Organization's International Classification of Functioning, Disability and Health (ICF) [3] is considered the international standard for describing and coding function and health states. Nevertheless, the ICF provides only a limited vocabulary for recognizing FSI descriptions, since its purpose is to organize concepts related to functioning rather than to provide a comprehensive terminology or a complete set of relations between concepts. While the free-text portion of EHRs might provide a more complete picture of health status, treatment, and progress, current Natural Language Processing (NLP) methods largely focus on extracting medical conditions (e.g., diagnoses and symptoms). The absence of a standardized functional terminology and the incompleteness of the ICF as a vocabulary source make it challenging to build an NLP system to extract FSI from EHR free text.
Our work takes the first step towards extraction of FSI from free text by systematically identifying the structure of FSI related to Mobility, a key domain of the ICF and an important domain in the determination of work disability. Our interdisciplinary research group inductively evaluated examples extracted from over 1,200 Physical Therapy (PT) notes from the Clinical Center of the National Institutes of Health (NIH). This extensive work resulted in a nested entity structure comprising 2 entities, 3 sub-entities, 8 attributes, and 21 attribute values. Furthermore, we have manually curated the first gold standard corpus of 200 double-annotated and 50 triple-annotated PT notes. Our inter-annotator agreement (IAA) averages 97% F1-score on partial textual span matching and ranges from 0.4 to 0.9 in Siegel & Castellan's kappa on attribute value matching. Such a rich semantic corpus of Mobility FSI is a valuable and promising resource for future statistical learning. Our method is also adaptable to other domains of the ICF.
Year
2017
DOI
10.1109/BIBM.2017.8218042
Venue
2017 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM)
Keywords
functional status information, functioning, ICF, natural language processing, manual curation, annotation, physical therapy
DocType
Conference
ISSN
2156-1125
Citations
0
PageRank
0.34
References
0
Authors
13
Name                   Order  Citations  PageRank
Thanh Thieu            1      0          0.34
Jonathan Camacho       2      0          0.34
Pei-Shu Ho             3      0          0.34
Julia Porcino          4      0          2.37
Min Ding               5      0          0.34
Lisa Nelson            6      0          0.34
Elizabeth Rasch        7      1          1.73
Chunxiao Zhou          8      0          0.34
Leighton Chan          9      2          1.41
Denis Newman-Griffis   10     2          2.08
Diane Brandt           11     0          0.68
Ao Yuan                12     0          0.68
Albert M. Lai          13     238        28.46