Title | ||
---|---|---|
GPU Performance Enhancement via Communication Cost Reduction: Case Studies of Radix Sort and WSN Relay Node Placement Problem |
Abstract | ||
---|---|---|
As the computational power of Graphics Processing Units (GPUs) increases, data transmission becomes the major performance bottleneck. In this study, we investigate two techniques, data streaming and data compression, to reduce communication cost on GPUs. Data streaming enables communication to overlap with computation, whereas data compression reduces the amount of data transferred among different memory spaces. Although both techniques increase computation cost, overall performance can still improve because communication cost falls. We demonstrate the effectiveness of the two techniques via two case studies: radix sort and 3-star, a deployment algorithm for wireless sensor networks. For radix sort, we present a new algorithm that mixes the most-significant-digit (MSD) and least-significant-digit (LSD) algorithms and employs data streaming. It is 25% faster than the fastest GPU radix sort implementation currently available in the public domain. For the 3-star algorithm, the GPU implementation runs several hundred times faster than the CPU code. Data streaming and data compression, combined in a hybrid CPU-GPU algorithm, provide an additional 54% performance improvement over the GPU implementation. Data compression not only reduces communication cost but also shortens computation time, which enables further performance enhancement. |
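
The abstract does not spell out how the MSD and LSD passes are combined, so the following is only a minimal CPU-side sketch of one plausible arrangement, not the authors' algorithm. It assumes non-negative integer keys that fit in `key_bits` bits; the names `hybrid_radix_sort`, `lsd_radix_sort`, `msd_bits`, and `radix_bits` are hypothetical. A single MSD pass splits the keys into independent chunks by their top bits, and each chunk is then sorted with LSD passes over the remaining bits. Because the chunks are independent, a GPU version could in principle transfer one chunk while sorting another, which is the kind of overlap data streaming provides.

```python
def lsd_radix_sort(keys, key_bits, radix_bits=8):
    """Stable LSD radix sort; orders keys by their lowest key_bits bits
    (rounded up to a multiple of radix_bits)."""
    mask = (1 << radix_bits) - 1
    for shift in range(0, key_bits, radix_bits):
        # Stable bucket (counting) pass on the current digit.
        buckets = [[] for _ in range(1 << radix_bits)]
        for k in keys:
            buckets[(k >> shift) & mask].append(k)
        keys = [k for bucket in buckets for k in bucket]
    return keys


def hybrid_radix_sort(keys, key_bits=32, msd_bits=4):
    """One MSD pass into independent chunks, then LSD sort per chunk.

    Assumes every key satisfies 0 <= key < 2**key_bits (an assumption of
    this sketch, not a statement about the paper's implementation).
    """
    low_bits = key_bits - msd_bits
    # MSD pass: chunk index is the top msd_bits bits of the key.
    chunks = [[] for _ in range(1 << msd_bits)]
    for k in keys:
        chunks[k >> low_bits].append(k)
    # Chunks are already ordered relative to each other, so sorting each
    # one independently and concatenating yields the fully sorted output.
    out = []
    for chunk in chunks:
        out.extend(lsd_radix_sort(chunk, low_bits))
    return out
```

For example, `hybrid_radix_sort([7, 3, 2**31, 0, 3])` sorts the five keys in ascending order; each chunk produced by the MSD pass is the unit that a streamed implementation would move and sort separately.
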
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/CCGrid.2012.16 | CCGrid |
Keywords | Field | DocType |
gpu implementation,gpu performance enhancement,case studies,communication cost reduction,wsn relay node placement,overall performance,data transmission,radix sort,performance improvement,communication cost,major performance bottleneck,performance enhancement,3-star algorithm,data compression,public domain,approximation algorithms,instruction sets,kernel,wireless sensor network,wireless sensor networks | Bottleneck,Central processing unit,Data transmission,Computer science,Parallel computing,Radix sort,Graphics processing unit,Data compression,Cost reduction,Distributed computing,Performance improvement | Conference |
Citations | PageRank | References |
2 | 0.39 | 17 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Che-Rung Lee | 1 | 78 | 13.52 |
Shih-hsiang Lo | 2 | 37 | 6.15 |
Nan-Hsi Chen | 3 | 2 | 0.39 |
Yeh-Ching Chung | 4 | 983 | 97.16 |
I-hsin Chung | 5 | 388 | 32.41 |