Abstract |
---|
k-Nearest Neighbor (kNN) is a widely used classifier in time series data analytics due to its interpretability. kNN is often referred to as a lazy learning algorithm because it neither learns a discriminative function nor generates rules from the training data. Instead, the kNN classifier requires a search over all the training data to classify a single test sample, which makes it computationally demanding and hard to adopt in real-world applications. Some of these applications are time-critical, such as solar flare prediction, where flares can have irreversible impacts on Earth. Therefore, scaling the nearest-neighbor search to large datasets is crucial. In this paper, we propose a new scalable methodology to mitigate the high computational cost of kNN by approximating the nearest neighbor(s) with the help of clustering as a preprocessing step. We tested our idea on a comprehensive set of datasets with varying sizes and numbers of labels. Our results show that the performance of our approximate technique is comparable to the exact kNN classifier with up to 10x speed-up. |
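The abstract's core idea — clustering the training data in a preprocessing step, then restricting the neighbor search to the cluster nearest the query — can be sketched as below. This is an illustrative reconstruction, not the authors' implementation: the use of plain k-means, the function names, and the parameters (`n_clusters`, `n_iter`, `k`) are all assumptions made for the sketch.

```python
import numpy as np

def fit_clusters(X, n_clusters=3, n_iter=20, seed=0):
    # Plain k-means (an assumed stand-in for the paper's clustering step):
    # returns cluster centers and the cluster assignment of each training point.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_clusters, replace=False)].astype(float)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for c in range(n_clusters):
            members = X[assign == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return centers, assign

def approx_knn_predict(x, X, y, centers, assign, k=1):
    # Approximate kNN: instead of scanning all of X, search only the
    # training points assigned to the cluster whose center is nearest to x.
    c = np.linalg.norm(centers - x, axis=1).argmin()
    candidates = np.where(assign == c)[0]
    d = np.linalg.norm(X[candidates] - x, axis=1)
    nearest = candidates[d.argsort()[:k]]
    labels, counts = np.unique(y[nearest], return_counts=True)
    return labels[counts.argmax()]  # majority vote over the k neighbors
```

The speed-up comes from the candidate set: each query is compared against roughly `len(X) / n_clusters` points instead of all of `len(X)`, at the cost of possibly missing a true nearest neighbor that sits just across a cluster boundary.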
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/ICPR.2018.8546103 | 2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) |
Keywords | Field | DocType
---|---|---|
Univariate Time Series classification, Scalable Nearest Neighbor Search, Density-Based Clustering | k-nearest neighbors algorithm, Interpretability, Pattern recognition, Computer science, Lazy learning, Preprocessor, Artificial intelligence, Classifier (linguistics), Cluster analysis, Discriminative model, Scalability | Conference
ISSN | Citations | PageRank
---|---|---|
1051-4651 | 0 | 0.34
References | Authors
---|---|
0 | 5
Name | Order | Citations | PageRank |
---|---|---|---|
Soukaina Filali Boubrahimi | 1 | 1 | 6.10 |
Ruizhe Ma | 2 | 17 | 4.69 |
Berkay Aydin | 3 | 40 | 10.75 |
Shah Muhammad Hamdi | 4 | 3 | 2.76 |
Rafal A. Angryk | 5 | 271 | 45.56 |