Abstract | ||
---|---|---|
In this paper, we present our monolingual and bilingual ad hoc information retrieval results at NTCIR-6. This year we compare two tokenization levels for indexing: unigrams and overlapping bigrams. Two well-known information retrieval models, the language model and BM25, were adopted in our study. The monolingual results show that our method reached the average level in most runs when overlapping bigrams were indexed, whereas unigram indexing gave poor results in almost all runs. In the bilingual retrieval tasks, we translate the queries with a well-known machine translation tool. The evaluation results of our method are given at the end of this paper. |
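
The two indexing units compared in the abstract can be illustrated with a minimal sketch. It assumes character-level tokens, as is common for CJK text; the abstract does not specify the exact unit, and the function names `unigrams` and `overlapping_bigrams` are hypothetical, not taken from the paper.

```python
def unigrams(text):
    """Return single-character tokens, skipping whitespace (assumed unit)."""
    return [ch for ch in text if not ch.isspace()]


def overlapping_bigrams(text):
    """Return every pair of adjacent characters, sliding one position at a time."""
    chars = unigrams(text)
    if len(chars) < 2:
        # Too short to form a pair; fall back to the unigram tokens.
        return chars
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


if __name__ == "__main__":
    sample = "abcd"
    print(unigrams(sample))
    print(overlapping_bigrams(sample))
```

Overlapping bigrams keep local character context in each index term without needing a word segmenter, which may be one reason they compare favorably with unigrams in the results described above.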
Year | Venue | Field |
---|---|---|
2007 | NTCIR | Tokenization (data security),Information retrieval,Computer science,Machine translation,Search engine indexing,Speech recognition,Natural language processing,Bigram,Artificial intelligence,Vector space model,Language model,Visual Word |
DocType | Citations | PageRank |
Conference | 1 | 0.35 |
References | Authors | |
17 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yu-Chieh Wu | 1 | 247 | 23.16 |
Kun-Chang Tsai | 2 | 15 | 1.50 |
Jie-Chi Yang | 3 | 350 | 43.91 |