Title
Towards effective strategies for monolingual and bilingual information retrieval: Lessons learned from NTCIR-4
Abstract
At the NTCIR-4 workshop, Justsystem Corporation (JSC) and Clairvoyance Corporation (CC) collaborated in the cross-language information retrieval (CLIR) task. Our goal was to evaluate the performance and robustness of our recently developed commercial-grade CLIR systems for English and Asian languages. The main contribution of this article is the investigation of different strategies, their interactions in both monolingual and bilingual retrieval tasks, and their respective contributions to operational retrieval systems in the context of NTCIR-4. We report results of Japanese and English monolingual retrieval and results of Japanese-to-English bilingual retrieval. In the monolingual retrieval analysis, we examine two special properties of the NTCIR experimental design (two levels of relevance and identical queries in multiple languages) and explore how they interact with strategies of our retrieval system, including pseudo-relevance feedback, multi-word term down-weighting, and term weight merging strategies. Our analysis shows that the choice of language (English or Japanese) does not have a significant impact on retrieval performance. Query expansion is slightly more effective with relaxed judgments than with rigid judgments. For better retrieval performance, weights of multi-word terms should be lowered. In the bilingual retrieval analysis, we aim to identify robust strategies that are effective when used alone and when used in combination with other strategies. We examine strategies specific to cross-lingual retrieval, such as translation disambiguation and translation structuring, as well as general strategies such as pseudo-relevance feedback and multi-word term down-weighting. For shorter title topics, pseudo-relevance feedback is a major performance enhancer, but translation structuring affects retrieval performance negatively when used alone or in combination with other strategies. All of the strategies tested improve retrieval performance for the longer description topics, with pseudo-relevance feedback and translation structuring as the major contributors.
Year
2005
DOI
10.1145/1105696.1105698
Venue
ACM Trans. Asian Lang. Inf. Process.
Keywords
cross-language information retrieval,cross-language retrieval task,towards effective strategy,comparison,english monolingual retrieval,monolingual information retrieval,bilingual retrieval analysis,ntcir,better retrieval performance,monolingual retrieval analysis,retrieval performance,pseudo-relevance feedback,translation structure,japanese-to-english bilingual retrieval,bilingual retrieval task,bilingual information retrieval,experimental design,information retrieval,query expansion
Field
Human–computer information retrieval,Query expansion,Information retrieval,Computer science,Robustness (computer science),Relevance (information retrieval),Natural language processing,Artificial intelligence,Merge (version control),Structuring,Cross-language information retrieval
DocType
Journal
Volume
4
Issue
2
Citations
1
PageRank
0.35
References
30
Authors
11
Name, Order, Citations, PageRank
Yan Qu, 1, 2, 0.83
David A. Hull, 2, 1282, 214.27
Gregory Grefenstette, 3, 1129, 147.00
David A. Evans, 4, 841, 147.89
Motoko Ishikawa, 5, 2, 0.83
Setsuko Nara, 6, 2, 0.83
Toshiya Ueda, 7, 2, 0.83
Daisuke Noda, 8, 2, 0.83
Kousaku Arita, 9, 2, 0.83
Yuki Funakoshi, 10, 2, 0.83
Hiroshi Matsuda, 11, 2, 0.83