Title
Impact of Instance Selection on kNN-Based Text Categorization
Abstract
With the increasing use of the Internet and electronic documents, automatic text categorization has become imperative. Several machine learning algorithms have been proposed for text categorization. The k-nearest neighbor algorithm (kNN) is known to be one of the best state-of-the-art classifiers for text categorization. However, kNN suffers from limitations such as the high computational cost of classifying new instances. Instance selection techniques have emerged as highly competitive methods for improving kNN through data reduction. However, previous work has evaluated these approaches only on structured datasets; their performance has not been examined in the text categorization domain, where the dimensionality and size of the datasets are very high. Motivated by these observations, this paper investigates and analyzes the impact of instance selection on kNN-based text categorization in terms of classification accuracy, classification efficiency, and data reduction.
Year
2018
DOI
10.3745/JIPS.02.0080
Venue
JOURNAL OF INFORMATION PROCESSING SYSTEMS
Keywords
Classification Accuracy, Classification Efficiency, Data Reduction, Instance Selection, k-Nearest Neighbors, Text Categorization
DocType
Journal
Volume
14
Issue
2
ISSN
1976-913X
Citations
0
PageRank
0.34
References
0
Authors
1
Name
Fatiha Barigou
Order
1
Citations
14
PageRank
6.76