Title
clusterNOR: A NUMA-Optimized Clustering Framework.
Abstract
Clustering algorithms are iterative and have complex data access patterns that result in many small random memory accesses. The performance of parallel implementations suffer from synchronous barriers for each iteration and skewed workloads. We rethink the parallelization of clustering for modern non-uniform memory architectures (NUMA) to maximizes independent, asynchronous computation. We eliminate many barriers, reduce remote memory accesses, and maximize cache reuse. We implement the u0027Clustering NUMA Optimized Routinesu0027 (clusterNOR) extensible parallel framework that provides algorithmic building blocks. The system is generic, we demonstrate nine modern clustering algorithms that have simple implementations. clusterNOR includes (i) in-memory, (ii) semi-external memory, and (iii) distributed memory execution, enabling computation for varying memory and hardware budgets. For algorithms that rely on Euclidean distance, clusterNOR defines an updated Elkanu0027s triangle inequality pruning algorithm that uses asymptotically less memory so that it works on billion-point data sets. clusterNOR extends and expands the scope of the u0027knoru0027 library for k-means clustering by generalizing underlying principles, providing a uniform programming interface and expanding the scope to hierarchical and linear algebraic classes of algorithms. The compound effect of our optimizations is an order of magnitude improvement in speed over other state-of-the-art solutions, such as Sparku0027s MLlib and Appleu0027s Turi.
Year
Venue
DocType
2019
arXiv: Distributed, Parallel, and Cluster Computing
Journal
Volume
Citations 
PageRank 
abs/1902.09527
0
0.34
References 
Authors
24
5
Name
Order
Citations
PageRank
Disa Mhembere1635.42
Da Zheng262.49
Carey E. Priebe3505108.56
Joshua T. Vogelstein427331.99
Randal Burns51955115.15