Title
A Text-Mining System for Concept Annotation in Biomedical Full Text Articles.
Abstract
PubTator Central (https://www.ncbi.nlm.nih.gov/research/pubtator/) [1] is a web service for exploring and retrieving bioconcept annotations in full-text biomedical articles. PubTator Central (PTC) provides automated annotations from state-of-the-art text mining systems for genes/proteins, genetic variants, diseases, chemicals, species and cell lines, all available for immediate download. PTC annotates PubMed (30 million abstracts), the PMC Open Access Subset and the Author Manuscript Collection (3 million full-text articles). These full-text articles increase the total number of annotations nearly four-fold. The new PTC web interface features semantic search and faceted shortcuts to improve navigation in full text. A significantly redesigned back end that heavily exploits nonrelational data delivers higher throughput and speed despite a large increase in data volume. Updated entity identification methods and a new disambiguation module based on deep learning techniques provide increased accuracy. The PTC web interface allows users to easily navigate through bioentities present in full-text articles, build full-text document collections and visualize concept annotations in each document. Annotations are downloadable in multiple formats (XML, JSON and tab delimited) via the online interface, a RESTful web service and bulk FTP. PTC is synchronized with PubMed and PubMed Central, with new articles added daily. The original PubTator [2] service has served annotated abstracts for ~300 million requests, enabling third-party research in use cases such as biocuration support, gene prioritization, genetic disease analysis, and literature-based knowledge discovery. We demonstrate that the full-text results in PTC significantly increase biomedical concept coverage, and we anticipate that this expansion will both enhance existing downstream applications and enable new use cases.
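The abstract notes that annotations can be downloaded in a tab-delimited format. As a rough illustration of how such a download might be consumed, the following Python sketch parses text in the layout commonly associated with PubTator exports: a `PMID|t|title` line, a `PMID|a|abstract` line, then one tab-separated line per annotation (PMID, start offset, end offset, mention, concept type, identifier). This layout is an assumption and is not described in the abstract itself. The names `Annotation`, `Document`, `parse_pubtator` and `SAMPLE` are hypothetical, and the sample record (PMID 12345) is made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Annotation:
    start: int          # character offset where the mention begins
    end: int            # character offset just past the mention
    mention: str        # surface text of the mention
    concept_type: str   # e.g. Gene, Disease, Chemical
    identifier: str     # normalized concept identifier


@dataclass
class Document:
    pmid: str
    title: str = ""
    abstract: str = ""
    annotations: List[Annotation] = field(default_factory=list)


def parse_pubtator(text: str) -> Dict[str, Document]:
    """Parse tab-delimited annotation text (assumed layout) into documents keyed by PMID."""
    docs: Dict[str, Document] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines separate documents
        parts = line.split("|", 2)
        if len(parts) == 3 and parts[0].isdigit() and parts[1] in ("t", "a"):
            # Title or abstract line: PMID|t|... or PMID|a|...
            doc = docs.setdefault(parts[0], Document(pmid=parts[0]))
            if parts[1] == "t":
                doc.title = parts[2]
            else:
                doc.abstract = parts[2]
            continue
        cols = line.split("\t")
        if len(cols) >= 6:
            # Annotation line: PMID, start, end, mention, type, identifier
            pmid, start, end, mention, ctype, ident = cols[:6]
            doc = docs.setdefault(pmid, Document(pmid=pmid))
            doc.annotations.append(
                Annotation(int(start), int(end), mention, ctype, ident)
            )
    return docs


# Made-up sample record, for illustration only (not real PTC output).
SAMPLE = (
    "12345|t|BRCA1 mutations in breast cancer.\n"
    "12345|a|A made-up abstract used only for illustration.\n"
    "12345\t0\t5\tBRCA1\tGene\t672\n"
    "12345\t19\t32\tbreast cancer\tDisease\tMESH:D001943\n"
)

if __name__ == "__main__":
    for doc in parse_pubtator(SAMPLE).values():
        for ann in doc.annotations:
            print(doc.pmid, ann.concept_type, ann.mention, ann.identifier)
```

The sketch deliberately avoids any network call, so it makes no claim about the service's endpoints; text fetched from the web service or the bulk download would be passed to `parse_pubtator` in the same way, provided it follows the assumed layout.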
Year
2019
DOI
10.1145/3307339.3343246
Venue
BCB
Field
Text mining, Annotation, Information retrieval, Computer science, Artificial intelligence, Machine learning
DocType
Conference
ISBN
978-1-4503-6666-3
Citations
0
PageRank
0.34
References
0
Authors
4
Name              Order  Citations  PageRank
Chih-Hsuan Wei    1      546        27.43
Alexis Allot      2      6          1.51
Robert Leaman     3      914        39.98
Zhiyong Lu        4      2735       171.27