Title
Cross-language retrieval using HAIRCUT at CLEF 2004
Abstract
JHU/APL continued to explore the use of knowledge-light methods for multilingual retrieval during the CLEF 2004 evaluation. We relied on the language-neutral techniques of character n-gram tokenization, pre-translation query expansion, statistical translation using aligned parallel corpora, fusion from disparate retrievals, and reliance on language similarity when resources are scarce. We participated in the monolingual and bilingual evaluations. Our results support the claims that n-gram-based retrieval is highly effective; that fusion of multiple retrievals is helpful in bilingual retrieval; and that reliance on language similarity in lieu of translation can outperform a high-performing system using abundant translation resources and a less similar query language.
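As an illustration of the character n-gram tokenization named in the abstract, the following is a minimal sketch, not the HAIRCUT implementation. The function name `char_ngrams`, the default length `n=4`, the lowercasing, and the choice to let n-grams span word boundaries are all assumptions made for this example; the abstract does not specify them.

```python
def char_ngrams(text: str, n: int = 4) -> list[str]:
    """Return overlapping character n-grams for a piece of text.

    Hypothetical helper: n=4 and the normalization below are assumptions,
    not settings taken from the paper.
    """
    # Lowercase, collapse whitespace runs to single spaces, and pad both ends
    # with a space so that word-initial and word-final n-grams are captured.
    normalized = " " + " ".join(text.lower().split()) + " "
    if len(normalized) < n:
        # Text shorter than n: return it as a single token.
        return [normalized]
    # Slide a window of width n one character at a time; windows may span
    # the space between adjacent words.
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


if __name__ == "__main__":
    print(char_ngrams("cross language"))
```

Because the tokens are raw character sequences, no stemmer, stopword list, or word segmenter is needed, which is why the abstract can describe the technique as language-neutral.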
Year: 2004
DOI: 10.1007/11519645_5
Venue: CLEF (Working Notes)
Keywords: similar query language, multilingual retrieval, cross-language retrieval, statistical translation, disparate retrieval, abundant translation resource, bilingual evaluation, multiple retrieval, character n-gram tokenization, bilingual retrieval, language similarity, query language, query expansion
Field: Tokenization (data security), Similitude, Query language, Query expansion, Information retrieval, Multilingualism, Computer science, Information access, Artificial intelligence, Natural language processing, Parsing, Clef
DocType: Conference
Volume: 3491
ISSN: 0302-9743
ISBN: 3-540-27420-0
Citations: 1
PageRank: 0.37
References: 10
Authors: 2
Name, Order, Citations, PageRank
Paul McNamee1295.30
James Mayfield263173.17