Title
Natural language processing methods for enhancing geographic metadata for phylogeography of zoonotic viruses.
Abstract
Zoonotic viruses represent emerging or re-emerging pathogens that pose significant public health threats throughout the world. It is therefore crucial to advance current surveillance mechanisms for these viruses through approaches such as phylogeography. Despite the abundance of zoonotic viral sequence data in publicly available databases such as GenBank, phylogeographic analysis of these viruses is often limited by the lack of adequate geographic metadata. However, many GenBank records include references to articles with more detailed information, and automated systems may help extract this information efficiently and effectively. In this paper, we describe our efforts to determine the proportion of GenBank records with "insufficient" geographic metadata for seven well-studied viruses. We also evaluate the performance of four different Named Entity Recognition (NER) systems for automatically extracting related entities using a manually created gold standard.
Year: 2014
Venue: BioNLP@ACL
Keywords: bioinformatics, biomedical research
Field: Phylogeography, Data mining, World Wide Web, Geospatial metadata, Data sequences, GenBank, Medicine, Named-entity recognition
DocType: Conference
Volume: 2014
ISSN: 2153-4063
Citations: 1
PageRank: 0.38
References: 5
Authors: 7
Name               Order  Citations  PageRank
Tasnia Tahsin      1      30         3.28
Rachel Beard       2      12         1.77
Robert Rivera      3      12         1.77
Rob Lauder         4      1          0.38
Garrick Wallstrom  5      1          0.38
Matthew Scotch     6      1          0.38
Graciela Gonzalez  7      1          0.38