Title
Efficient stream compaction on wide SIMD many-core architectures
Abstract
Stream compaction is a common parallel primitive used to remove unwanted elements in sparse data. This allows highly parallel algorithms to maintain performance over several processing steps and reduces overall memory usage. For wide SIMD many-core architectures, we present a novel stream compaction algorithm and explore several variations thereof. Our algorithm is designed to maximize concurrent execution, with minimal use of synchronization. Bandwidth and auxiliary storage requirements are reduced significantly, which allows for substantially better performance. We have tested our algorithms using CUDA on a PC with an NVIDIA GeForce GTX280 GPU. On this hardware, our reference implementation provides a 3x speedup over previously published algorithms.
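To make the primitive concrete, the following is a minimal sequential sketch of the generic prefix-sum formulation of stream compaction. It is not the algorithm presented in the paper, and the names `compact` and `keep` are hypothetical, chosen only for illustration. It shows the three conceptual phases (flag, exclusive scan, scatter) that parallel implementations typically distribute across threads.

```python
from itertools import accumulate


def compact(values, keep):
    """Return the elements of `values` accepted by `keep`, in original order.

    Illustrative sketch only: a sequential model of scan-based compaction,
    not the paper's SIMD algorithm.
    """
    # Phase 1: mark each element as valid (1) or unwanted (0).
    flags = [1 if keep(v) else 0 for v in values]

    # Phase 2: exclusive prefix sum, so offsets[i] is the number of
    # valid elements that precede index i (i.e. the output position).
    inclusive = list(accumulate(flags))
    offsets = [0] + inclusive[:-1] if flags else []
    total = inclusive[-1] if inclusive else 0

    # Phase 3: scatter every valid element to its output position.
    out = [None] * total
    for i, v in enumerate(values):
        if flags[i]:
            out[offsets[i]] = v
    return out
```

For example, `compact([3, 0, 0, 7, 0, 5], lambda v: v != 0)` drops the zeros and keeps the remaining elements in their original relative order. In a parallel setting, phases 1 and 3 are independent per element, so the scan in phase 2 is where synchronization and auxiliary storage costs usually concentrate.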
Year
2009
DOI
10.1145/1572769.1572795
Venue
High Performance Graphics
Keywords
minimal use, parallel algorithm, previous published algorithm, auxiliary storage requirement, stream compaction, concurrent execution, better performance, wide simd many-core architecture, efficient stream compaction, overall memory usage, nvidia geforce gtx280 gpu, novel stream compaction algorithm, sparse data, gpgpu, prefix sum
Field
Prefix sum, Parallel algorithm, Computer science, CUDA, Parallel computing, SIMD, General-purpose computing on graphics processing units, Sparse matrix, Auxiliary memory, Speedup
DocType
Conference
Citations
48
PageRank
2.46
References
13
Authors
3
Name | Order | Citations | PageRank
Markus Billeter | 1 | 153 | 13.30
Ola Olsson | 2 | 113 | 11.67
Ulf Assarsson | 3 | 621 | 42.84