Title
Kurdish stemmer pre-processing steps for improving information retrieval
Abstract
The rapid increase in the quantity of Kurdish documents over the last several years has created a need to improve accuracy and precision in text classification and retrieval. Stemming is an essential pre-processing step that increases the likelihood of matching terms in a document during text classification tasks. It also reduces the total number of searchable terms within a document or query. This article proposes an active approach for stemming Kurdish Sorani texts that reduces word variants to single terms, or stems. The results demonstrate that stemming decreases the dimensionality of document feature vectors and thereby increases retrieval effectiveness. The process applied to Kurdish Sorani can also be adapted to Kurdish Kurmanji for greater efficiency and effectiveness in digital text classification and related applications.
Year: 2018
DOI: 10.1177/0165551516683617
Venue: Periodicals
Keywords: Kurdish stemming, list of Kurdish stop words, stemming approaches
Field: Feature vector, Information retrieval, Computer science, Curse of dimensionality, Natural language processing, Artificial intelligence
DocType: Journal
Volume: 44
Issue: 1
ISSN: 0165-5515
Citations: 1
PageRank: 0.34
References: 10
Authors: 2
Name: Arazo M. Mustafa; Order: 1; Citations: 1; PageRank: 0.68
Name: Tarik Rashid; Order: 2; Citations: 19; PageRank: 9.27