Abstract | ||
---|---|---|
Extracting and identifying gene and protein names from the literature is a critical step in mining functional information about genes and proteins. While extensive effort has been devoted to this important task, most of it has aimed at extracting gene/protein names per se, with little attention paid to associating the extracted names with existing gene and protein database entries. We developed a simple and efficient method to identify gene and protein names in the literature using a combination of heuristic and statistical strategies. Our approach maps the extracted names to individual LocusLink entries, thus enabling the seamless integration of literature information with existing gene/protein databases. Evaluation on a test corpus shows that our method achieves both high recall and high precision. Our method exhibits good performance and can be used as a building block for large biomedical literature mining systems. |
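
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of mapping name mentions to database entries, not the authors' method. It assumes a small synonym dictionary and a single normalization heuristic (lowercasing and stripping punctuation), and it leaves out the statistical component entirely. The dictionary entries, the `LL0001`-style identifiers, and the function names `normalize`, `map_names`, and `precision_recall` are all hypothetical placeholders.

```python
import re

# Hypothetical synonym table: normalized name -> LocusLink-style ID.
# These entries are illustrative placeholders, not real database records.
SYNONYMS = {
    "examplegene1": "LL0001",
    "eg1": "LL0001",
    "sampleprotein2": "LL0002",
}


def normalize(name):
    """Lowercase the name and drop spaces, hyphens, and other punctuation."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def map_names(text):
    """Return a sorted list of (surface form, ID) pairs found in the text."""
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", text)
    hits = set()
    for i in range(len(tokens)):
        # Try candidate spans of one and two tokens starting at position i.
        for n in (1, 2):
            span = tokens[i:i + n]
            if len(span) < n:
                continue
            surface = " ".join(span)
            key = normalize(surface)
            if key in SYNONYMS:
                hits.add((surface, SYNONYMS[key]))
    return sorted(hits)


def precision_recall(predicted, gold):
    """Compute precision and recall over two collections of IDs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sentence = "Example-Gene1 (EG1) interacts with sample protein2."
    for surface, entry_id in map_names(sentence):
        print(surface, "->", entry_id)
```

Normalizing before lookup lets spelling variants such as `Example-Gene1` and `EG1` resolve to the same entry. A realistic system would also need to handle ambiguous symbols and names missing from the dictionary, which is presumably where the statistical strategies mentioned in the abstract come in.
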
Year | DOI | Venue |
---|---|---|
2003 | 10.1109/CSB.2003.1227431 | CSB |
Keywords | Field | DocType |
heuristic strategies,method can achieve,combination of heuristic,gene database entries,protein database entries,gene functional information mining,identification,gene name extraction,genetics,literature,large biomedical literature mining,proteins,protein name extraction,mining functional information,biology computing,gene name identification,proteins functional information mining,statistical strategies,protein names identification,data mining,test corpus,approach will map,biological texts,literature information seamless integration,protein names from literature,efficient method,protein names,critical step,gene and protein name,protein database entry,identifying gene,biomedical literature mining systems,locuslink entries | Heuristic,Gene,Computer science,Precision and recall,Bioinformatics,Protein Databases | Conference |
ISBN | Citations | PageRank |
0-7695-2000-6 | 4 | 0.60 |
References | Authors | |
9 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Weijian Xuan | 1 | 57 | 4.23 |
Stanley J. Watson | 2 | 4 | 0.60 |
Huda Akil | 3 | 71 | 4.79 |
Fan Meng | 4 | 114 | 10.82 |