Title: NUMA-optimized In-memory and Semi-external-memory Parameterized Clustering
Abstract
K-means is one of the most influential and widely used machine learning algorithms today. Its computation limits the performance and scalability of many statistical analysis and machine learning tasks. We rethink k-means in terms of modern architectures to develop a novel parallelization scheme that delays and minimizes synchronization barriers. We use this scheme to develop two modules, k$||$means$_T$ and SEM-k$||$means$_T^{RC}$. k$||$means$_T$ is optimized for NUMA architectures and radically improves performance for datasets that fit in memory. SEM-k$||$means$_T^{RC}$ improves k-means' scalability on a memory budget using semi-external memory, SSDs, and a minimalistic triangle-inequality pruning algorithm. SEM-k$||$means$_T^{RC}$ scales to billions of points on a single machine, using a fraction of the resources that distributed in-memory systems require. We show that k$||$means$_T$ outperforms commercial products such as H$_2$O, Dato, and Spark's MLlib by up to an order of magnitude on $O(10^7)$-point datasets. Furthermore, SEM-k$||$means$_T^{RC}$ effortlessly clusters $\geq O(10^8)$-point datasets while outperforming distributed in-memory frameworks.
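The abstract's triangle-inequality pruning can be illustrated with a minimal sketch: if a point's distance to its current centroid $c_a$ satisfies $d(x, c_a) \leq \tfrac{1}{2} d(c_a, c_b)$, then $c_b$ cannot be closer, so that distance computation is skipped. The code below is an illustrative Lloyd's-iteration sketch of this idea, not the authors' implementation; the function name and structure are assumptions.

```python
import numpy as np

def kmeans_ti(X, k, n_iter=20, seed=0):
    """Lloyd's k-means with triangle-inequality pruning (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), k, replace=False)].astype(float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Pairwise centroid-to-centroid distances, recomputed once per iteration.
        cc = np.linalg.norm(C[:, None, :] - C[None, :, :], axis=2)
        for i, x in enumerate(X):
            a = labels[i]
            da = np.linalg.norm(x - C[a])
            for b in range(k):
                # Triangle inequality: if d(x, c_a) <= 0.5 * d(c_a, c_b),
                # centroid c_b cannot be closer than c_a -- skip it.
                if b == a or da <= 0.5 * cc[a, b]:
                    continue
                db = np.linalg.norm(x - C[b])
                if db < da:
                    a, da = b, db
            labels[i] = a
        # Standard Lloyd update: move each centroid to its cluster mean.
        for j in range(k):
            pts = X[labels == j]
            if len(pts):
                C[j] = pts.mean(axis=0)
    return labels, C
```

For well-separated clusters, most inner-loop distance computations are pruned, which is the effect the paper exploits at scale.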
Year: 2016
Venue: arXiv: Distributed, Parallel, and Cluster Computing
Field: Synchronization, Parameterized complexity, Spark (mathematics), Computer science, Parallel computing, Real-time computing, Triangle inequality, Cluster analysis, Scalability, Auxiliary memory, Distributed computing, Computation
DocType: Journal
Volume: abs/1606.08905
Citations: 0
PageRank: 0.34
References: 7
Authors: 5
Name                  Order  Citations  PageRank
Disa Mhembere         1      63         5.42
Da Zheng              2      6          2.49
Joshua T. Vogelstein  3      273        31.99
Carey E. Priebe       4      505        108.56
Randal C. Burns       5      8          4.19