Title | ||
---|---|---|
Knowledge-Intensive and Statistical Approaches to the Retrieval and Annotation of Genomics MEDLINE Citations |
Abstract | ||
---|---|---|
Retrieving and annotating relevant information sources in the genomics literature are difficult but common tasks undertaken by biologists. The research presented here addresses these issues by exploring methods for retrieving MEDLINE® citations that answer real biologists' information needs and by addressing the initial tasks required to annotate MEDLINE citations having genomic content with terms from the Gene Ontology (GO). We approached the retrieval task using two methods: aggressive, knowledge-intensive query expansion and text neighboring. Our approaches to the triage subtask for annotation consisted of traditional machine learning (ML) methods as well as a novel ML algorithm for thematic analysis. Finally, we used a statistical, n-gram heuristic to decide which of the GO hierarchies should be used to annotate a given MEDLINE citation. |
Year | Venue | Keywords |
---|---|---|
2004 | TREC | medline,genomics,mesh,information need,machine learning |
Field | DocType | Citations |
---|---|---|
Thematic analysis, Data mining, Heuristic, Information needs, Annotation, Information retrieval, Query expansion, Computer science, Citation, Triage, MEDLINE | Conference | 8 |
PageRank | References | Authors |
---|---|---|
0.85 | 8 | 12 |
Name | Order | Citations | PageRank |
---|---|---|---|
Alan R. Aronson | 1 | 2551 | 260.67 |
Susanne M. Humphrey | 2 | 561 | 63.27 |
Nicholas C. Ide | 3 | 91 | 10.78 |
Won Kim | 4 | 173 | 171.15 |
Russell R. Loane | 5 | 30 | 3.36 |
James G. Mork | 6 | 647 | 65.22 |
Lawrence H. Smith | 7 | 196 | 14.48 |
Lorraine Tanabe | 8 | 383 | 29.80 |
W. John Wilbur | 9 | 424 | 43.91 |
Natalie Xie | 10 | 123 | 9.13 |
Dina Demner-Fushman | 11 | 1717 | 147.70 |
Hongfang Liu | 12 | 1479 | 160.66 |