Title
Domain-Specific IR for German, English and Russian Languages.
Abstract
In participating in this domain-specific track, our first objective is to propose and evaluate a light stemmer for the Russian language. Our second objective is to measure the relative merit of various search engines used for German and, to a lesser extent, English. To do so, we evaluated tf·idf, Okapi, IR models derived from the Divergence from Randomness (DFR) paradigm, and a language model (LM). For the Russian language, we find that word-based indexing using our light stemming procedure results in better retrieval effectiveness than the 4-gram indexing strategy (relative difference of around 30%). Using the German corpus, we examine variations in retrieval effectiveness after applying the specialized thesaurus to automatically enlarge topic descriptions. In this case, the performance variations were relatively small and usually not significant.
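As a rough illustration of the 4-gram indexing strategy that the abstract compares against word-based indexing, the sketch below splits each word into overlapping character 4-grams. The function names `char_ngrams` and `index_terms`, the whitespace tokenization, and the choice to keep words shorter than four characters whole are assumptions made for this example; the abstract does not specify how the n-grams were generated.

```python
def char_ngrams(word, n=4):
    """Return the overlapping character n-grams of a word.

    Assumption: words with n or fewer characters are kept as a single term.
    """
    if len(word) <= n:
        return [word]
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def index_terms(text, n=4):
    """Lowercase the text, split on whitespace, and emit n-grams per word."""
    terms = []
    for word in text.lower().split():
        terms.extend(char_ngrams(word, n))
    return terms


if __name__ == "__main__":
    print(char_ngrams("computer"))
    print(index_terms("Light stemmer"))
```

Under this scheme the indexing terms are fixed-length substrings, so no language-specific stemming rules are needed; a word-based index would instead store one stemmed form per word.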
Year: 2007
DOI: 10.1007/978-3-540-85760-0_26
Venue: Advances in Multilingual and Multimodal Information Retrieval
Keywords: russian language, language model, english language, light stemmer, domain-specific ir, 4-gram indexing strategy, russian languages, better retrieval effectiveness, relative merit, german corpus, retrieval effectiveness, relative difference, indexation, search engine
DocType: Conference
Volume: 5152
ISSN: 0302-9743
Citations: 4
PageRank: 0.54
References: 9
Authors: 4
Name | Order | Citations | PageRank
Claire Fautsch | 1 | 51 | 7.18
Ljiljana Dolamic | 2 | 125 | 10.84
Samir ABDOU | 3 | 92 | 8.06
Jacques Savoy | 4 | 1601 | 169.85