A Concept-Based Framework for Passage Retrieval at Genomics - Citegraph

Paper Info

Title
A Concept-Based Framework for Passage Retrieval at Genomics

Abstract
The task of the TREC 2006 Genomics Track is to retrieve passages (from part of a paragraph to a full paragraph) from full-text HTML biomedical journal papers to answer structured questions from real biologists. A system for this task needs to parse the HTML free text (converting it into plain text) and pinpoint the most relevant passage(s) within documents for the specified question. Our system accomplishes this in three steps. The first step parses the HTML articles and partitions them into paragraphs. The second step retrieves the relevant paragraphs. The third step identifies the most relevant passages within those paragraphs and ranks them. We are interested in three questions: (1) How does a concept-based IR model perform on structured queries compared to Okapi? (2) Does query expansion based on domain knowledge increase retrieval effectiveness? (3) Does our abbreviation database from MEDLINE help improve query expansion, and does abbreviation disambiguation help improve the ranking? The experimental results show that our concept-based IR model works better than Okapi; that query expansion based on domain knowledge is important, especially for queries with very few relevant documents; and that an abbreviation database for query expansion and disambiguation is helpful for passage retrieval.
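
The three steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the scoring is plain term overlap rather than the concept-based model or Okapi, paragraphs are assumed to be marked by <p> tags, sentences stand in for candidate passages, and the abbreviation table passed to expand_query is a hypothetical stand-in for the MEDLINE abbreviation database. All function and class names are illustrative.

```python
from html.parser import HTMLParser
import re


class ParagraphExtractor(HTMLParser):
    """Step 1: collect the plain text of each <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._buffer = None  # None while outside a <p> element

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._buffer is not None:
            # Join text fragments and normalise whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def expand_query(terms, abbreviations):
    """Add long forms for known abbreviations (hypothetical lookup table)."""
    expanded = set(terms)
    for term in terms:
        expanded.update(tokenize(abbreviations.get(term, "")))
    return expanded


def score(text, query_terms):
    """Fraction of query terms present in the text (stand-in scoring)."""
    tokens = set(tokenize(text))
    return len(tokens & query_terms) / len(query_terms) if query_terms else 0.0


def retrieve_passages(html, query, abbreviations=None, top_k=3):
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    query_terms = expand_query(tokenize(query), abbreviations or {})
    # Step 2: keep paragraphs that match at least one query term.
    candidates = [p for p in parser.paragraphs if score(p, query_terms) > 0]
    # Step 3: treat sentences as candidate passages and rank them.
    passages = []
    for paragraph in candidates:
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            s = score(sentence, query_terms)
            if s > 0:
                passages.append((s, sentence))
    passages.sort(key=lambda item: item[0], reverse=True)
    return passages[:top_k]
```

Calling retrieve_passages(html, "APC colon cancer", {"apc": "adenomatous polyposis coli"}) returns up to top_k (score, sentence) pairs, highest score first. Expanding the abbreviation lets a paragraph that only uses the long form still match the query, which is the effect the abstract attributes to the abbreviation database.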

Year	Venue	Keywords
2006	TREC	domain knowledge,query expansion
Field	DocType	Citations
Data mining,Query expansion,Information retrieval,Computer science,Genomics,Natural language processing,Artificial intelligence	Conference	12
PageRank	References	Authors
0.71	8	4

Authors (4 rows)

Name	Order	Citations	PageRank
Wei Zhou	1	112	27.01
Clement T. Yu	2	3171	1419.96
Vetle I. Torvik	3	430	27.15
Neil R. Smalheiser	4	658	57.50
