Title
A Method to Generate a Reduced Training Set for Faster and Better Nearest Neighbor Classification.
Abstract
The classification time and space requirements of nearest neighbor based classifiers depend directly on the training set size. Several methods exist to reduce the training set size, and others, often called bootstrap methods, generate artificial training sets aimed at achieving better classification accuracy. This paper proposes a method that pursues both objectives: improving performance by generating a bootstrapped training set, and reducing the training set size by eliminating some irrelevant training patterns. The proposed method is faster than similar recent methods and runs in time linear in the training set size. The method first clusters the training patterns of each class using the c-means clustering method to derive c mean patterns; then, for each pattern, a new pattern is derived by taking a weighted combination of the pattern with its mean. This smooths the boundary between classes in the feature space and hence can act as a regularization step. In addition, a threshold distance is set, and all patterns that fall within this distance from a mean pattern are removed from the training set. Since these are mostly interior patterns, their removal will not affect the boundary between the classes. Experimentally, the proposed method is compared against recent relevant methods and is shown to be effective and faster than them. The proposed method is suitable for large data sets such as those in data mining.
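The steps named in the abstract (per-class c-means, a weighted combination of each pattern with its mean, and removal of patterns close to a mean) can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names, the `weight` and `threshold` parameters, the fixed iteration count, and the choice to measure the threshold distance on the original (unshifted) pattern are all assumptions.

```python
import math
import random


def c_means(points, c, iters=20, seed=0):
    """Plain c-means (k-means) on a list of equal-length tuples.

    Returns c mean patterns. A fixed number of iterations keeps the
    cost linear in len(points) (assumption: iters is a constant).
    """
    rng = random.Random(seed)
    means = rng.sample(points, c)  # initial means: c randomly chosen patterns
    for _ in range(iters):
        groups = [[] for _ in range(c)]
        for p in points:
            nearest = min(range(c), key=lambda i: math.dist(p, means[i]))
            groups[nearest].append(p)
        # Recompute each mean; keep the old mean if its cluster became empty.
        means = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else m
            for g, m in zip(groups, means)
        ]
    return means


def reduce_training_set(patterns, labels, c=2, weight=0.5, threshold=0.5):
    """Hypothetical sketch of the bootstrapped, reduced training set.

    For every class: cluster its patterns, drop patterns closer than
    `threshold` to their nearest mean (assumed to be interior patterns),
    and replace each remaining pattern p by weight*p + (1 - weight)*mean.
    """
    new_patterns, new_labels = [], []
    for cls in sorted(set(labels)):
        members = [p for p, y in zip(patterns, labels) if y == cls]
        means = c_means(members, min(c, len(members)))
        for p in members:
            mean = min(means, key=lambda m: math.dist(p, m))
            if math.dist(p, mean) < threshold:
                continue  # interior pattern: removed from the training set
            shifted = tuple(weight * a + (1 - weight) * b for a, b in zip(p, mean))
            new_patterns.append(shifted)
            new_labels.append(cls)
    return new_patterns, new_labels
```

With `weight=1.0` and `threshold=0.0` the sketch returns the original training set unchanged; lowering `weight` pulls the retained patterns toward their cluster means, and raising `threshold` discards more patterns near those means. The reduced set would then be used as the reference set of an ordinary nearest neighbor classifier.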
Year
2010
Venue
KDIR 2010: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND INFORMATION RETRIEVAL
Keywords
Clustering, Classification, Bootstrapped training set, Artificial training set, Nearest neighbor classifier
Field
k-nearest neighbors algorithm, Training set, Pattern recognition, Computer science, Best bin first, Artificial intelligence, Nearest-neighbor chain algorithm, Large margin nearest neighbor
DocType
Conference
Citations
0
PageRank
0.34
References
1
Authors
3
Name, Order, Citations, PageRank
P. Viswanath, 1, 148, 11.77
V. Suresh Babu, 2, 38, 4.00
T. Naveen Kumar, 3, 0, 0.34