Title
Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text.
Abstract
Pharmacogenomics studies the relationship between genetic variation and the variation in drug response phenotypes. The field is rapidly gaining importance: it promises drugs targeted to particular subpopulations based on genetic background. The pharmacogenomics literature has expanded rapidly, but is dispersed across many journals. It is therefore challenging to identify important associations between drugs and molecular entities, particularly genes and gene variants, and these critical connections are often lost. Text mining techniques can convert free text to a computable, searchable format in which pharmacogenomic concepts (such as genes, drugs, polymorphisms, and diseases) are identified, and important links between these concepts are recorded. Availability of full text articles as input to text mining engines is key, as abstracts often do not contain sufficient information to identify these pharmacogenomic associations. Thus, building on a tool called Textpresso, we have created the Pharmspresso tool to assist in identifying important pharmacogenomic facts in full text articles. Pharmspresso parses text to find references to human genes, polymorphisms, drugs and diseases and their relationships. It presents these as a series of marked-up text fragments, in which key concepts are visually highlighted. To evaluate Pharmspresso, we used a gold standard of 45 human-curated articles. Pharmspresso identified 78%, 61%, and 74% of target gene, polymorphism, and drug concepts, respectively. Pharmspresso is a text analysis tool that extracts pharmacogenomic concepts from the literature automatically and thus captures our current understanding of gene-drug interactions in a computable form. We have made Pharmspresso available at http://pharmspresso.stanford.edu.
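To make the idea in the abstract concrete, the following is a minimal sketch of lexicon- and regex-based concept tagging, with marked-up output and a recall calculation of the kind reported above. It is not Pharmspresso's actual implementation: the tiny lexicons, the polymorphism pattern, the `[category:text]` markup, and all function names are assumptions chosen for illustration.

```python
import re

# Hypothetical mini-lexicons; a real system would load curated vocabularies.
LEXICON = {
    "gene": ["CYP2D6", "VKORC1", "TPMT"],
    "drug": ["warfarin", "codeine", "azathioprine"],
}

# Illustrative pattern only: rsIDs and simple substitutions such as V433M.
POLYMORPHISM = re.compile(r"\b(?:rs\d+|[A-Z]\d+[A-Z])\b")


def build_patterns(lexicon):
    """Compile one case-insensitive, whole-word pattern per category."""
    return {
        cat: re.compile(
            r"\b(?:" + "|".join(map(re.escape, terms)) + r")\b",
            re.IGNORECASE,
        )
        for cat, terms in lexicon.items()
    }


def tag_sentence(sentence, patterns):
    """Return sorted (start, end, category, text) spans found in the sentence."""
    spans = []
    for cat, pattern in patterns.items():
        for m in pattern.finditer(sentence):
            spans.append((m.start(), m.end(), cat, m.group()))
    return sorted(spans)


def highlight(sentence, spans):
    """Wrap each span in [category:text] markers, working right to left
    so that earlier character offsets stay valid."""
    for start, end, cat, text in sorted(spans, reverse=True):
        sentence = sentence[:start] + f"[{cat}:{text}]" + sentence[end:]
    return sentence


def recall(found, gold):
    """Fraction of gold-standard concepts that were found."""
    gold = set(gold)
    return len(gold & set(found)) / len(gold) if gold else 0.0


PATTERNS = build_patterns(LEXICON)
PATTERNS["polymorphism"] = POLYMORPHISM

sentence = "Carriers of rs9923231 in VKORC1 required lower warfarin doses."
spans = tag_sentence(sentence, PATTERNS)
marked = highlight(sentence, spans)
# Categories that co-occur in one sentence hint at a candidate relationship.
categories = {cat for _, _, cat, _ in spans}
print(marked)
```

A sentence in which a gene, a polymorphism, and a drug co-occur, as in this example, is the kind of fragment a curator would want surfaced. Sentence-level co-occurrence is used here only as a simple stand-in for relationship detection.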
Year
2009
DOI
10.1186/1471-2105-10-S2-S6
Venue
BMC Bioinformatics
Keywords
pharmacogenetics, drug interaction, internet, genetic variation, gene polymorphism, computational biology, text mining, polymorphism, text analysis, gold standard, drug targeting
Field
Data science, Regular expression, Text mining, Biology, Information extraction, Bioinformatics, Pharmacogenomics
DocType
Journal
Volume
10 Suppl 2
Issue
S-2
ISSN
1471-2105
Citations
54
PageRank
1.78
References
10
Authors
2
Name
Order
Citations
PageRank
Yael Garten11838.73
Russ B. Altman22500456.07