Abstract |
---|
This paper presents a cross-language retrieval system that retrieves English documents in response to queries in Bengali and Hindi, developed as part of our participation in the CLEF 2007 Ad-hoc bilingual track. We followed a dictionary-based Machine Translation approach to generate equivalent English queries from the Indian language topics. Our main challenge was working with a limited-coverage dictionary (about 20% coverage) for Hindi-English and a virtually non-existent dictionary for Bengali-English, so we depended mostly on a phonetic transliteration system to overcome this. The CLEF results point to the need for a rich bilingual lexicon, a translation disambiguator, a Named Entity Recognizer, and a better transliterator for CLIR involving Indian languages. The best MAP values for Bengali and Hindi CLIR in our experiments were 7.26% and 4.77%, which are 20% and 13% of our best monolingual retrieval, respectively. |
Year | DOI | Venue |
---|---|---|
2008 | 10.1007/978-3-540-85760-0_12 | CLEF |
Keywords | DocType | Volume |
hindi clir,equivalent english query,best monolingual retrieval,english document,limited coverage dictionary,clef result,cross-language retrieval system,english clir evaluation,indian language topic,ad-hoc bilingual track,indian language,machine translation | Conference | 5152 |
ISSN | Citations | PageRank |
0302-9743 | 2 | 0.46 |
References | Authors |
19 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Debasis Mandal | 1 | 4 | 1.54 |
Mayank Gupta | 2 | 118 | 10.60 |
Sandipan Dandapat | 3 | 73 | 15.17 |
Pratyush Banerjee | 4 | 52 | 6.57 |
Sudeshna Sarkar | 5 | 423 | 210.58 |