Title
A Top-Down Parallel Semisort
Abstract
Semisorting is the problem of reordering an input array of keys such that equal keys are contiguous but different keys are not necessarily in sorted order. Semisorting is important for collecting equal values and is widely used in practice. For example, it is the core of the MapReduce paradigm, is a key component of the database join operation, and has many other applications. We describe a (randomized) parallel algorithm for the problem that is theoretically efficient (linear work and logarithmic depth), but is designed to be more practically efficient than previous algorithms. We use ideas from the parallel integer sorting algorithm of Rajasekaran and Reif, but instead of processing bits of integers in a reduced range in a bottom-up fashion, we process the hashed values of keys directly top-down. We implement the algorithm and experimentally show on a variety of input distributions that it outperforms a similarly-optimized radix sort on a modern 40-core machine with hyper-threading by about a factor of 1.7 to 1.9, and achieves a parallel speedup of up to 38x. We discuss the various optimizations used in our implementation and present an extensive experimental analysis of its performance.
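The semisorting contract described in the abstract can be illustrated with a minimal sequential sketch. It is not the authors' parallel algorithm; it only shows the idea of bucketing records top-down by the high-order bits of a hashed key, so that equal keys end up contiguous without the keys being sorted. The names `semisort` and `stable_hash` and the parameters `radix_bits` and `base_size` are assumptions made for this illustration.

```python
import hashlib


def stable_hash(key):
    """Hypothetical helper: deterministic 64-bit hash of repr(key)."""
    digest = hashlib.blake2b(repr(key).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def semisort(records, key=lambda r: r, radix_bits=4, shift=60, base_size=16):
    """Return records reordered so that records with equal keys are contiguous.

    Sequential illustration only: buckets by the top bits of the hashed key,
    then recurses on the next lower bits (top-down).
    """
    if len(records) <= base_size or shift < 0:
        # Base case: group by exact key, which also resolves hash collisions.
        groups = {}
        for r in records:
            groups.setdefault(key(r), []).append(r)
        return [r for group in groups.values() for r in group]

    mask = (1 << radix_bits) - 1
    buckets = [[] for _ in range(1 << radix_bits)]
    for r in records:
        # Equal keys have equal hashes, so they always share a bucket.
        buckets[(stable_hash(key(r)) >> shift) & mask].append(r)

    out = []
    for bucket in buckets:
        out.extend(semisort(bucket, key, radix_bits, shift - radix_bits, base_size))
    return out


pairs = [("a", 1), ("b", 2), ("a", 3)]
print(semisort(pairs, key=lambda r: r[0]))  # [('a', 1), ('a', 3), ('b', 2)]
```

The output order of distinct keys depends on their hash values, which is exactly the freedom semisorting allows compared with a full sort.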
Year: 2015
DOI: 10.1145/2755573.2755597
Venue: ACM Symposium on Parallelism in Algorithms and Architectures
Keywords: Parallel Algorithms, Semisorting, Integer Sorting
Field: Counting sort, Integer, Parallel algorithm, Computer science, Parallel computing, Radix sort, Top-down and bottom-up design, Algorithm, Integer sorting, Logarithm, Speedup, Distributed computing
DocType: Conference
Citations: 5
PageRank: 0.38
References: 16
Authors: 4
Name | Order | Citations | PageRank
Yan Gu | 1 | 43 | 5.10
Julian Shun | 2 | 593 | 32.57
Yihan Sun | 3 | 73 | 11.19
Guy E. Blelloch | 4 | 2927 | 207.30