Algorithm for grounding mutation mentions from text to protein sequences - Citegraph

Paper Info

Title
Algorithm for grounding mutation mentions from text to protein sequences

Abstract
Protein mutations derived from in vitro experimental analysis are described in detail in scientific papers. Reuse of mutation impact annotations is an important subfield of bioinformatics for which mutation grounding is a critical step. Presented here is a method for grounding textual mentions from papers describing mutational changes to proteins. We distinguish between grounding of mutation entities to protein database identifiers and to the correct positions on sequences extracted from protein databases. The grounding workflow coordinates the extraction of mutation, protein and organism mentions from texts and uses these to identify target sequences. Mutation mentions are sequentially mapped onto candidate proteins to facilitate their correct grounding to a protein sequence, independent of a protein-mutation tuple extraction task. Using a gold standard corpus of full text articles and corresponding protein sequences, we show high precision and recall and discuss novel aspects of the algorithm in the context of previous work.
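
The abstract does not spell out the mapping step, so the following is only a minimal sketch of what mapping mutation mentions onto candidate sequences could look like, not the authors' implementation. It assumes single-letter point-mutation mentions of the form wild-type residue, 1-based position, mutant residue (for example "A2T"), and it picks the candidate whose sequence agrees with the most mentions. The names `MUTATION_RE`, `extract_mutations`, `matches` and `ground`, and the scoring rule itself, are hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumption: mentions look like "A2T" (wild-type residue, position, mutant residue).
AMINO = "ACDEFGHIKLMNPQRSTVWY"
MUTATION_RE = re.compile(rf"\b([{AMINO}])(\d+)([{AMINO}])\b")

Mutation = Tuple[str, int, str]


def extract_mutations(text: str) -> List[Mutation]:
    """Collect (wild_type, position, mutant) triples found in the text."""
    return [(m.group(1), int(m.group(2)), m.group(3))
            for m in MUTATION_RE.finditer(text)]


def matches(sequence: str, mutation: Mutation) -> bool:
    """True if the sequence has the wild-type residue at the 1-based position."""
    wild_type, position, _ = mutation
    return 1 <= position <= len(sequence) and sequence[position - 1] == wild_type


def ground(mutations: List[Mutation],
           candidates: Dict[str, str]) -> Tuple[Optional[str], List[Mutation]]:
    """Return the candidate ID that agrees with the most mentions, plus those mentions.

    Hypothetical scoring: count mentions whose wild-type residue is found
    at the stated position. Returns (None, []) if nothing matches.
    """
    best_id: Optional[str] = None
    best_hits: List[Mutation] = []
    for protein_id, sequence in candidates.items():
        hits = [m for m in mutations if matches(sequence, m)]
        if len(hits) > len(best_hits):
            best_id, best_hits = protein_id, hits
    return best_id, best_hits
```

Calling `ground(extract_mutations(text), candidates)` with a dictionary of candidate IDs to sequences returns the best-supported candidate. A realistic system would also need to handle three-letter residue names, numbering offsets such as signal peptides, and ties between candidates, none of which this sketch attempts.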

Year	DOI	Venue
2010	10.1007/978-3-642-15120-0_10	DILS
Keywords	Field	DocType
protein sequence,sequence analysis,experimental analysis,gold standard,natural language processing	Data mining,Identifier,Protein sequencing,Tuple,Computer science,Precision and recall,Algorithm,Protein Databases,Mutation,Sequence analysis	Conference
Volume	ISSN	ISBN
6254	0302-9743	3-642-15119-1
Citations	PageRank	References
4	0.42	19
Authors
3

Authors (3 rows)

Name	Order	Citations	PageRank
Jonas Bergman Laurila	1	18	2.11
K Rajaraman	2	380	31.94
Christopher J. O. Baker	3	329	30.96