Title
Identifying gene and protein names from biological texts
Abstract
Extracting and identifying gene and protein names from the literature is a critical step in mining functional information about genes and proteins. Although extensive effort has been devoted to this task, most approaches aim at extracting gene and protein names per se and pay little attention to associating the extracted names with existing gene and protein database entries. We developed a simple and efficient method to identify gene and protein names in the literature using a combination of heuristic and statistical strategies. Our approach maps the extracted names to individual LocusLink entries, thus enabling the seamless integration of literature information with existing gene and protein databases. Evaluation on a test corpus shows that our method can achieve both high recall and high precision. The method performs well and can be used as a building block for large biomedical literature mining systems.
Year: 2003
DOI: 10.1109/CSB.2003.1227431
Venue: CSB
Keywords: heuristic strategies, method can achieve, combination of heuristic, gene database entries, protein database entries, gene functional information mining, identification, gene name extraction, genetics, literature, large biomedical literature mining, proteins, protein name extraction, mining functional information, biology computing, gene name identification, proteins functional information mining, statistical strategies, protein names identification, data mining, test corpus, approach will map, biological texts, literature information seamless integration, protein names from literature, efficient method, protein names, critical step, gene and protein name, protein database entry, identifying gene, biomedical literature mining systems, locuslink entries
Field: Heuristic, Gene, Computer science, Precision and recall, Bioinformatics, Protein Databases
DocType: Conference
ISBN: 0-7695-2000-6
Citations: 4
PageRank: 0.60
References: 9
Authors: 4
Name | Order | Citations | PageRank
Weijian Xuan | 1 | 57 | 4.23
Stanley J. Watson | 2 | 4 | 0.60
Huda Akil | 3 | 71 | 4.79
Fan Meng | 4 | 114 | 10.82