Title | ||
---|---|---|
Mining 100 million notes to find homelessness and adverse childhood experiences: 2 case studies of rare and severe social determinants of health in electronic health records. |
Abstract | ||
---|---|---|
Understanding how to identify the social determinants of health from electronic health records (EHRs) could provide important insights into health and disease outcomes. We developed a methodology to capture 2 rare and severe social determinants of health, homelessness and adverse childhood experiences (ACEs), from a large EHR repository. We first constructed lexicons to capture homelessness and ACE phenotypic profiles. We employed word2vec and lexical associations to mine homelessness-related words. Next, using relevance feedback, we refined the 2 profiles with iterative searches over 100 million notes from the Vanderbilt EHR. Seven assessors manually reviewed the top-ranked results of 2544 patient visits relevant for homelessness and 1000 patients relevant for ACE. word2vec yielded better performance (area under the precision-recall curve [AUPRC] of 0.94) than lexical associations (AUPRC = 0.83) for extracting homelessness-related words. A comparative study of searches for the 2 phenotypes revealed higher performance for homelessness (AUPRC = 0.95) than for ACE (AUPRC = 0.79). A temporal analysis of the homeless population showed that the majority experienced chronic homelessness. Most ACE patients suffered sexual (70%) and/or physical (50.6%) abuse, with the top-ranked abuser keywords being "father" (21.8%) and "mother" (15.4%). The most prevalent associated conditions for homeless patients were lack of housing (62.8%) and tobacco use disorder (61.5%), while for ACE patients they were mental disorders (36.6%-47.6%). We provide an efficient solution for mining homelessness and ACE information from EHRs, which can facilitate large clinical and genetic studies of these social determinants of health. |
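
The abstract compares methods by AUPRC over ranked results that assessors judged as relevant or not. The record does not say how the authors computed AUPRC, so the following is only a minimal sketch that assumes average precision, a common step-wise estimate of the area under the precision-recall curve. The function name `average_precision` and the sample judgments are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Average precision of a ranked list (assumed estimator of AUPRC).

    ranked_relevance[i] is True when the result at rank i + 1 was
    judged relevant by an assessor.
    """
    total_relevant = sum(1 for r in ranked_relevance if r)
    if total_relevant == 0:
        # No relevant items: define the score as 0.0 to avoid dividing by zero.
        return 0.0

    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at each rank where a relevant item appears.
            precision_sum += hits / rank
    return precision_sum / total_relevant


# Hypothetical assessor judgments for the top 5 results of one search.
judgments = [True, True, False, True, False]
print(round(average_precision(judgments), 3))
```

A ranking that places every relevant result ahead of every irrelevant one scores 1.0 under this measure, which is why it suits the comparison of ranked search outputs described in the abstract.
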
Year | DOI | Venue |
---|---|---|
2018 | 10.1093/jamia/ocx059 | Journal of the American Medical Informatics Association |
Keywords | Field | DocType |
text mining, homelessness, adverse childhood experiences, social determinants of health, EHR | Adverse Childhood Experiences, Environmental health, Medicine, Social determinants of health | Journal |
Volume | Issue | ISSN |
25 | 1 | 1067-5027 |
Citations | PageRank | References |
3 | 0.45 | 20 |
Authors | ||
12 |
Name | Order | Citations | PageRank |
---|---|---|---|
Cosmin Adrian Bejan | 1 | 21 | 2.14 |
John Angiolillo | 2 | 3 | 0.79 |
Douglas Conway | 3 | 3 | 0.45 |
Robertson Nash | 4 | 3 | 0.45 |
Jana Shirey-Rice | 5 | 15 | 3.51 |
Loren Lipworth-Elliot | 6 | 3 | 0.45 |
Robert M Cronin | 7 | 63 | 9.71 |
Jill M. Pulley | 8 | 126 | 12.84 |
Sunil Kripalani | 9 | 3 | 0.45 |
Shari Barkin | 10 | 3 | 1.13 |
Kevin B. Johnson | 11 | 337 | 39.11 |
Joshua C. Denny | 12 | 932 | 97.43 |