Title | ||
---|---|---|
Fast Parallel Algorithms for Euclidean Minimum Spanning Tree and Hierarchical Spatial Clustering |
Abstract | ||
---|---|---|
ABSTRACTThis paper presents new parallel algorithms for generating Euclidean minimum spanning trees and spatial clustering hierarchies (known as HDBSCAN*). Our approach is based on generating a well-separated pair decomposition followed by using Kruskal's minimum spanning tree algorithm and bichromatic closest pair computations. We introduce a new notion of well-separation to reduce the work and space of our algorithm for HDBSCAN*. We also give a new parallel divide-and-conquer algorithm for computing the dendrogram and reachability plots, which are used in visualizing clusters of different scale that arise for both EMST and HDBSCAN*. We show that our algorithms are theoretically efficient: they have work (number of operations) matching their sequential counterparts, and polylogarithmic depth (parallel time). We implement our algorithms and propose a memory optimization that requires only a subset of well-separated pairs to be computed and materialized, leading to savings in both space (up to 10x) and time (up to 8x). Our experiments on large real-world and synthetic data sets using a 48-core machine show that our fastest algorithms outperform the best serial algorithms for the problems by 11.13--55.89x, and existing parallel algorithms by at least an order of magnitude. |
Year | DOI | Venue |
---|---|---|
2021 | 10.1145/3448016.3457296 | International Conference on Management of Data |
DocType | ISSN | Citations |
Conference | 0730-8078 | 3 |
PageRank | References | Authors |
0.38 | 0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Wang, Yiqiu | 1 | 3 | 1.39 |
Shangdi Yu | 2 | 3 | 2.07 |
Yan Gu | 3 | 57 | 10.46 |
Julian Shun | 4 | 593 | 32.57 |