Title
GELL: Automatic Extraction of Epidemiological Line Lists from Open Sources
Abstract
Real-time monitoring of and response to emerging public health threats rely on the availability of timely surveillance data. During the early stages of an epidemic, the ready availability of line lists with detailed tabular information about laboratory-confirmed cases can assist epidemiologists in making reliable inferences and forecasts. Such inferences are crucial for understanding the epidemiology of a specific disease early enough to stop or control the outbreak. However, constructing such line lists requires considerable human supervision, and they are therefore difficult to generate in real time. In this paper, we motivate Guided Epidemiological Line List (GELL), the first tool for building automated line lists (in near real-time) from open-source reports of emerging disease outbreaks. Specifically, we focus on deriving epidemiological characteristics of an emerging disease and the affected population from reports of illness. GELL uses distributed vector representations (à la word2vec) to discover a set of indicators for each line list feature. This discovery of indicators is followed by the use of dependency-parsing-based techniques for final extraction in tabular form. We evaluate the performance of GELL against a human-annotated line list provided by HealthMap corresponding to MERS outbreaks in Saudi Arabia. We demonstrate that GELL extracts line list features with increased accuracy compared to a baseline method. We further show how these automatically extracted line list features can be used for making epidemiological inferences, such as inferring demographics and the symptoms-to-hospitalization period of affected individuals.
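The indicator-discovery step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that indicators for a line list feature are found by ranking vocabulary words by cosine similarity to a seed word. The toy vectors, the seed words, and the function name discover_indicators are hypothetical and chosen for illustration only; they are not taken from the paper, and the subsequent dependency-parsing step is not shown.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only. A real system
# would learn embeddings (e.g. with word2vec) from a corpus of outbreak reports.
TOY_VECTORS = {
    "hospitalized": [0.90, 0.10, 0.00],
    "admitted":     [0.85, 0.15, 0.05],
    "ward":         [0.70, 0.20, 0.10],
    "fever":        [0.10, 0.90, 0.00],
    "cough":        [0.15, 0.85, 0.10],
    "camel":        [0.00, 0.10, 0.90],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def discover_indicators(seed, vectors, k=2):
    """Return the k words most similar to the seed word (hypothetical helper)."""
    seed_vec = vectors[seed]
    scored = [
        (cosine(seed_vec, vec), word)
        for word, vec in vectors.items()
        if word != seed
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]


if __name__ == "__main__":
    # Candidate indicators for a hospitalization-related feature.
    print(discover_indicators("hospitalized", TOY_VECTORS, k=2))
    # Candidate indicators for a symptom-related feature.
    print(discover_indicators("fever", TOY_VECTORS, k=1))
```

In this sketch, words that sit close to the seed in the vector space become candidate indicators for the corresponding feature; how GELL actually selects seeds and thresholds is not specified in the abstract.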
Year
2017
DOI
10.1145/3097983.3098073
Venue
KDD
Keywords
Automated Line Listing, GELL, Word Embeddings, Dependency Parsing, Negation Detection
Field
Population, Data mining, Computer science, Epidemiology, Dependency grammar, Information extraction, Demographics, Artificial intelligence, Word2vec, Disease early, Machine learning
DocType
Conference
ISBN
978-1-4503-4887-4
Citations
0
PageRank
0.34
References
13
Authors
8
Name                   Order  Citations  PageRank
Saurav Ghosh           1      314        11.99
Prithwish Chakraborty  2      179        17.22
Bryan Lewis            3      138        15.16
Maimuna S. Majumder    4      0          1.35
Emily Cohn             5      7          2.21
John S Brownstein      6      191        21.62
Madhav Marathe         7      2775       262.17
Naren Ramakrishnan     8      1913       176.25