Title
A Survey Of Stemming Algorithms In Information Retrieval
Abstract
Background. During the last fifty years, improved information retrieval techniques have become necessary because of the huge amount of information people have available, which continues to increase rapidly due to the use of new technologies and the Internet. Stemming is one of the processes that can improve information retrieval in terms of accuracy and performance.
Aim. This paper provides a detailed assessment of the current status of the stemming process framed in an information retrieval application field by tracing its historical evolution.
Method. Papers presenting the first approaches for stemming were reviewed to extract their main features, benefits and drawbacks. Additionally, papers dealing with stemmers for non-English languages or with some more recent proposals were also consulted and compiled. Finally, experimental papers defining the most well-known methods and metrics aimed at evaluating and classifying stemmers were also taken into account to expose their contributions and results.
Results. Even if not all researchers agree on the benefits and drawbacks of using stemming in an information retrieval process in general terms, many of them agree on its benefits in specific contexts, such as when the language is highly inflective, when documents are short or when there is limited space for storing data. Some researchers also state that the nature of the documents can influence the performance and the accuracy of the stemmer.
Conclusions. Despite many researchers having investigated this field over many years, there are still some open questions, such as how to evaluate a stemmer independently of the information retrieval process, or how much a stemmer improves an information retrieval application in terms of speed. As a summary, some guidelines are also provided to help readers to determine which is the best stemmer for their needs and the tasks they have to carry out.
Year: 2014
Venue: INFORMATION RESEARCH-AN INTERNATIONAL ELECTRONIC JOURNAL
Field: Cognitive models of information retrieval, Human–computer information retrieval, Information retrieval, Computer science, Emerging technologies, Relevance (information retrieval), Tracing
DocType: Journal
Volume: 19
Issue: 1
ISSN: 1368-1613
Citations: 4
PageRank: 0.43
References: 22
Authors: 4
Name | Order | Citations | PageRank
Cristian Moral | 1 | 7 | 4.88
Angélica de Antonio | 2 | 161 | 27.23
Ricardo Imbert | 3 | 53 | 8.60
Jaime Ramírez | 4 | 114 | 16.36