Abstract |
---|
Stream compaction is a common parallel primitive used to remove unwanted elements in sparse data. This allows highly parallel algorithms to maintain performance over several processing steps and reduces overall memory usage. For wide SIMD many-core architectures, we present a novel stream compaction algorithm and explore several variations thereof. Our algorithm is designed to maximize concurrent execution, with minimal use of synchronization. Bandwidth and auxiliary storage requirements are reduced significantly, which allows for substantially better performance. We have tested our algorithms using CUDA on a PC with an NVIDIA GeForce GTX280 GPU. On this hardware, our reference implementation provides a 3x speedup over previous published algorithms. |
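
For readers unfamiliar with the primitive, the following is a minimal sequential sketch of the classic scan-based formulation of stream compaction (flag, prefix sum, scatter). It is not the algorithm presented in the paper, whose details are not given in this record; the function name `compact` and the predicate parameter `keep` are illustrative assumptions.

```python
from itertools import accumulate
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def compact(items: Sequence[T], keep: Callable[[T], bool]) -> List[T]:
    """Sequential sketch of scan-based stream compaction (illustrative only)."""
    # Step 1: flag every element, 1 = keep, 0 = discard.
    flags = [1 if keep(x) else 0 for x in items]

    # Step 2: an exclusive prefix sum over the flags gives each kept
    # element its position in the compacted output.
    inclusive = list(accumulate(flags))
    offsets = [s - f for s, f in zip(inclusive, flags)]

    # Step 3: scatter the kept elements to their output positions.
    # On a GPU each of the three steps is data-parallel; here they are loops.
    total = inclusive[-1] if inclusive else 0
    out: List[T] = [None] * total  # type: ignore[list-item]
    for x, f, o in zip(items, flags, offsets):
        if f:
            out[o] = x
    return out
```

Calling `compact([3, 0, 0, 7, 0, 5], lambda v: v != 0)` removes the zero entries while preserving the order of the remaining elements.
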
Year | DOI | Venue |
---|---|---|
2009 | 10.1145/1572769.1572795 | High Performance Graphics |
Keywords | Field | DocType |
---|---|---|
minimal use,parallel algorithm,previous published algorithm,auxiliary storage requirement,stream compaction,concurrent execution,better performance,wide simd many-core architecture,efficient stream compaction,overall memory usage,nvidia geforce gtx280 gpu,novel stream compaction algorithm,sparse data,gpgpu,prefix sum | Prefix sum,Parallel algorithm,Computer science,CUDA,Parallel computing,SIMD,General-purpose computing on graphics processing units,Sparse matrix,Auxiliary memory,Speedup | Conference |
Citations | PageRank | References |
---|---|---|
48 | 2.46 | 13 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Markus Billeter | 1 | 153 | 13.30 |
Ola Olsson | 2 | 113 | 11.67 |
Ulf Assarsson | 3 | 621 | 42.84 |