Title
TREC-10 Web Track Experiments at MSRA
Abstract
In TREC-10, Microsoft Research Asia (MSRA) participated in the Web track (ad hoc retrieval task and homepage finding task). The latest version of the Okapi system (Windows 2000 version) was used. We focused on developing content-based retrieval and link-based retrieval, and investigated a suitable combination of the two. For content-based retrieval, we examined the problems of weighting scheme, re-weighting, and pseudo-relevance feedback (PRF). We then developed a method called collection refinement (CE) for query expansion (QE). We investigated the use of two kinds of link information: link anchor and link structure. We used anchor descriptions instead of content text to build the index. Furthermore, different search strategies, such as spreading activation and PageRank, were tested. Experimental results show: (1) The Okapi system is robust and effective for web retrieval. (2) In the ad hoc task, content-based retrieval achieved much better performance, and the impact of anchor text can be neglected, while for the homepage finding task, both anchor text and content text provide useful information, contributing more to precision and recall respectively. (3) Although query expansion does not show any improvement in our web retrieval experiments, we believe that there is still potential for CE.
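As an illustration of the link-structure side mentioned in the abstract, the following is a minimal sketch of PageRank computed by power iteration. The abstract does not say how PageRank was configured or combined with content scores, so everything here is an assumption for illustration: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the toy link graph are all hypothetical and not taken from the paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict: page -> list of linked pages.

    The damping factor and iteration count are illustrative assumptions,
    not values reported in the paper.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start from a uniform distribution.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Rank held by pages without out-links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {
            p: (1.0 - damping) / n + damping * dangling / n for p in pages
        }
        # Each page passes an equal share of its rank to every target.
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: "home" is linked from every other page.
toy_graph = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home", "about"],
    "orphan": ["home"],
}
scores = pagerank(toy_graph)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy graph the page with the most in-links ranks highest and the page with no in-links ranks lowest, which is the kind of signal a homepage finding run could combine with content or anchor-text scores.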
Year: 2001
Venue: TREC
Keywords: anchor text, indexation, query expansion, spreading activation
Field: PageRank, Web search query, Data mining, Weighting, Web retrieval, Query expansion, Information retrieval, Computer science, Precision and recall, Anchor text
DocType: Conference
Citations: 8
PageRank: 0.84
References: 10
Authors: 8
Name, Order, Citations, PageRank
Jianfeng Gao, 1, 5729, 296.43
Guihong Cao, 2, 815, 39.28
Hongzhao He, 3, 74, 4.92
Min Zhang, 4, 1658, 134.93
Jian-Yun Nie, 5, 3681, 238.61
Stephen J. Walker, 6, 8, 0.84
Stephen Robertson, 7, 6204, 669.07
Université de Montréal, 8, 15, 3.20