| Abstract |
|---|
| Nearest neighbor search is a cornerstone problem in computational geometry, non-parametric statistics, and machine learning. For N points, exhaustive search requires quadratic work, but many fast algorithms reduce the complexity for exact and approximate searches. A kernel common to all of these algorithms (the kNN kernel) solves many small-size problems exactly using exhaustive search. We propose an efficient implementation and performance analysis for the kNN kernel on x86 architectures. By fusing the distance calculation with the neighbor selection, we are able to better utilize memory throughput. We present an analysis of the algorithm and explain parameter selection. We perform an experimental study varying the size of the problem, the dimension of the dataset, and the number of nearest neighbors. Overall we observe significant speedups. For example, when searching for 16 neighbors in a dataset of 1.6 million points in 64 dimensions, our kernel is over 4 times faster than existing methods. |
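The fusion the abstract describes can be illustrated with a minimal sketch: rather than materializing the full query-to-point distance matrix and then selecting neighbors in a second pass, each distance is streamed directly into a bounded selection structure of size k. This is only an illustrative reconstruction of the idea, not the paper's optimized x86 implementation; the function name `knn_kernel` and the heap-based selection are assumptions for the example.

```python
import heapq

def knn_kernel(queries, points, k):
    """Exhaustive kNN kernel sketch with fused distance + selection.

    For each query, squared distances to all points are computed and
    immediately fed into a size-k max-heap, so a candidate is discarded
    as soon as it is known not to be among the k closest and the full
    distance matrix is never stored.
    """
    results = []
    for q in queries:
        heap = []  # max-heap via negated distances: (-dist2, index)
        for i, p in enumerate(points):
            d2 = sum((a - b) ** 2 for a, b in zip(q, p))
            if len(heap) < k:
                heapq.heappush(heap, (-d2, i))
            elif -heap[0][0] > d2:
                # current farthest kept candidate is worse; replace it
                heapq.heapreplace(heap, (-d2, i))
        # report the k survivors sorted by increasing distance
        results.append(sorted((-d, i) for d, i in heap))
    return results
```

In the paper's setting the same idea is applied to many small problems at once on x86 hardware; the heap here stands in for whatever vectorized selection the optimized kernel uses.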
| Year | DOI | Venue |
|---|---|---|
| 2015 | 10.1145/2807591.2807601 | International Conference for High Performance Computing, Networking, Storage, and Analysis |
| Keywords | Field | DocType |
|---|---|---|
| High dimensional data analysis, machine learning, nearest-neighbor problems, high-performance computing, parallel algorithms, data mining | k-nearest neighbors algorithm, Fixed-radius near neighbors, Radial basis function kernel, Kernel embedding of distributions, Computer science, Tree kernel, Theoretical computer science, Polynomial kernel, Kernel method, Nearest neighbor search | Conference |
| ISBN | Citations | PageRank |
|---|---|---|
| 978-1-5090-0273-3 | 10 | 0.52 |
| References | Authors |
|---|---|
| 17 | 5 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Chenhan D. Yu | 1 | 62 | 6.25 |
| Jianyu Huang | 2 | 15 | 2.34 |
| Woody Austin | 3 | 10 | 0.52 |
| Bo Xiao | 4 | 52 | 4.06 |
| George Biros | 5 | 938 | 77.86 |