Title
A Search Engine for Morphologically Complex Languages
Abstract
Document retrieval on natural languages with a rich morphology -- particularly in terms of derivation and (single-word) composition -- suffers from serious performance degradation with the direct query-term-to-text-word matching paradigm that underlies the vast majority of current search engines. We propose an alternative approach in which morphologically complex word forms, which appear in the query as well as in the documents, are segmented into relevant subwords (such as stems, named entities, acronyms) and are subsequently submitted to the matching procedure. We evaluate our approach with the Alta Vista Search Engine on a large medical document collection.
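The idea in the abstract, segmenting both query and document words into subwords before matching, can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny lexicon, the set of linking elements, and the function names `segment`, `index_terms`, and `matches` are all hypothetical, and the longest-prefix-first backtracking split is just one simple segmentation strategy chosen for illustration.

```python
import re

# Hypothetical toy subword lexicon (assumption, not taken from the paper).
LEXICON = {"gastr", "enter", "hepat", "nephr", "itis", "o"}
# Hypothetical linking elements that carry no meaning and are not indexed.
LINKERS = {"o"}


def segment(word, lexicon=LEXICON):
    """Split a word into lexicon subwords, trying longer prefixes first.

    Returns [word] unchanged when no complete segmentation exists.
    """
    def split(rest):
        if not rest:
            return []
        for end in range(len(rest), 0, -1):
            if rest[:end] in lexicon:
                tail = split(rest[end:])
                if tail is not None:
                    return [rest[:end]] + tail
        return None

    parts = split(word.lower())
    return parts if parts is not None else [word.lower()]


def index_terms(text):
    """Return the set of relevant subwords for every word in the text."""
    terms = set()
    for token in re.findall(r"[a-z]+", text.lower()):
        terms.update(p for p in segment(token) if p not in LINKERS)
    return terms


def matches(query, document):
    """True when every query subword also occurs in the document."""
    return index_terms(query) <= index_terms(document)


if __name__ == "__main__":
    doc = "Patient with gastroenteritis"
    print(segment("gastroenteritis"))
    print(matches("enteritis", doc))
```

In this sketch the query "enteritis" matches a document containing only "gastroenteritis", which direct word-to-word matching would miss because the two surface forms differ.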
Year: 2001
DOI: 10.1007/3-540-44816-0_8
Venue: IDA
Keywords: complex word form,search engine,direct query-term-to-text-word,alta vista,large medical document collection,current search engine,matching procedure,alternative approach,natural language,document retrieval,morphologically complex languages
Field: Search engine,Computer science,Natural language,Natural language processing,Artificial intelligence,Document retrieval
DocType: Conference
ISBN: 3-540-42581-0
Citations: 0
PageRank: 0.34
References: 10
Authors: 3
Name            Order  Citations  PageRank
Udo Hahn        1      937        88.14
Martin Honeck   2      21         2.62
Stefan Schulz   3      6          1.38