Abstract |
---|
Good automatic information extraction tools offer hope for automatic processing of the exploding biomedical literature, and successful named entity recognition is a key component for such tools. We present a maximum-entropy based system incorporating a diverse set of features for identifying gene and protein names in biomedical abstracts. This system was entered in the BioCreative comparative evaluation and achieved a precision of 0.83 and recall of 0.84 in the "open" evaluation and a precision of 0.78 and recall of 0.85 in the "closed" evaluation. Central contributions are rich use of features derived from the training data at multiple levels of granularity, a focus on correctly identifying entity boundaries, and the innovative use of several external knowledge sources including full MEDLINE abstracts and web searches. |
Year | DOI | Venue |
---|---|---|
2005 | 10.1186/1471-2105-6-S1-S5 | BMC Bioinformatics |
Keywords | Field | DocType |
microarrays, bioinformatics, maximum entropy, algorithms | Information retrieval, Protein identification, Computer science, Information extraction, Bioinformatics, Automatic processing, Named-entity recognition | Journal |
Volume | Issue | ISSN |
6 | S1 | 1471-2105 |
Citations | PageRank | References |
61 | 4.68 | 21 |
Authors |
---|
6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jenny Rose Finkel | 1 | 1275 | 68.58 |
Shipra Dingare | 2 | 155 | 11.59 |
Christopher D. Manning | 3 | 22579 | 1126.22 |
Malvina Nissim | 4 | 479 | 51.48 |
Beatrice Alex | 5 | 237 | 25.59 |
Claire Grover | 6 | 729 | 100.15 |