Abstract |
---|
The goal of gene normalization (GN) is to identify the unique database identifiers of genes and proteins mentioned in biomedical literature. A major difficulty in GN is inter-species gene ambiguity: the same gene name can refer to different database identifiers depending on the species in question. In this paper, we introduce a method that exploits contextual information in an abstract, such as tissue type and chromosome location, to tackle this problem. Using this technique, we have been able to improve system performance (F-score) by 14.3% on the BioCreAtIvE-II GN task test set. |
Year | DOI | Venue |
---|---|---|
2009 | 10.1109/IRI.2009.5211619 | IRI |
Keywords | Field | DocType |
---|---|---|
different database,database management systems,chromosome location,genetics,biocreative-ii gn task test set,gene name,contextual information,major difficulty,biology computing,inter-species gene ambiguity,gene normalization,unique database identifiers,gene normalization ambiguity,biomedical literature,biocreative-ii gn task test,databases,proteins,system performance,dictionaries,data mining | Data mining,Contextual information,Gene normalization,Identifier,Information retrieval,Computer science,Exploit,Ambiguity,Test set | Conference |
ISBN | Citations | PageRank |
---|---|---|
978-1-4244-4116-7 | 9 | 0.59 |
References | Authors |
---|---|
3 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Po-Ting Lai | 1 | 130 | 9.32 |
Yue-Yang Bow | 2 | 49 | 2.34 |
Chi-Hsin Huang | 3 | 89 | 3.99 |
Hong-Jie Dai | 4 | 288 | 21.58 |
Richard Tzong-Han Tsai | 5 | 714 | 54.89 |
Wen-Lian Hsu | 6 | 1701 | 198.40 |
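
The abstract describes using contextual cues in the text (tissue type, chromosome location, and so on) to decide which species-specific identifier an ambiguous gene name refers to. The record gives no implementation details, so the following is only a minimal sketch of that general idea, not the authors' method. The `Candidate` schema, the `disambiguate` function, the identifiers, and the context terms are all hypothetical and chosen for illustration; the scoring is a plain count of overlapping tokens.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """One database entry a gene mention could map to (hypothetical schema)."""
    identifier: str
    species: str
    # Single-token cues such as a tissue type or a chromosome location.
    context_terms: set = field(default_factory=set)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def disambiguate(abstract, candidates):
    """Return the identifier whose cues overlap most with the abstract.

    Returns None when no candidate shares any cue with the abstract.
    Ties go to the candidate listed first.
    """
    tokens = tokenize(abstract)

    def score(candidate):
        cues = {term.lower() for term in candidate.context_terms}
        cues.add(candidate.species.lower())
        return sum(1 for cue in cues if cue in tokens)

    best = max(candidates, key=score)
    return best.identifier if score(best) > 0 else None


# Illustrative data only: identifiers and cue lists are made up for this sketch.
candidates = [
    Candidate("human:0001", "human", {"17q21", "breast"}),
    Candidate("mouse:0001", "mouse", {"mammary", "embryo"}),
]
abstract = "BRCA1 maps to 17q21 and is mutated in human breast tumors."
result = disambiguate(abstract, candidates)
```

Here the human candidate matches three cues in the abstract and the mouse candidate matches none, so `result` is the hypothetical human identifier. A real system would need multi-word cue matching and a curated source of per-gene context terms, neither of which this sketch attempts.
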