Title
GPU Performance Enhancement via Communication Cost Reduction: Case Studies of Radix Sort and WSN Relay Node Placement Problem
Abstract
As the computational power of Graphics Processing Units (GPUs) increases, data transmission becomes the major performance bottleneck. In this study, we investigate two techniques, data streaming and data compression, to reduce the communication cost on GPUs. Data streaming enables the overlap of communication and computation, whereas data compression reduces the size of the data transferred among different memory spaces. Although both techniques increase computation cost, overall performance can still be enhanced by reducing communication cost. We demonstrate the effectiveness of the two techniques via two case studies: radix sort and 3-star, a deployment algorithm in wireless sensor networks. For radix sort, a new algorithm, which mixes the most-significant-digit (MSD) and least-significant-digit (LSD) algorithms and employs data streaming, is presented. It is 25% faster than the fastest GPU radix sort implementation currently available in the public domain. For the 3-star algorithm, the GPU implementation runs several hundred times faster than the CPU code. Data streaming and data compression, used in a hybrid CPU-GPU algorithm, provide an additional 54% performance improvement over the GPU implementation. Data compression not only reduces communication cost but also improves computation time, which yields further performance enhancement.
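The abstract does not spell out how the MSD and LSD passes are combined, so the following is only a minimal CPU-side sketch of one way to mix them, not the authors' GPU algorithm. It assumes non-negative integer keys below 2**key_bits; the function names and the parameters key_bits, msd_bits, and digit_bits are hypothetical. One MSD pass splits the keys by their top bits into independent buckets, and each bucket is then sorted with a stable LSD radix sort. Because the buckets do not depend on each other, a streamed implementation could in principle transfer one bucket while sorting another.

```python
def lsd_radix_sort(keys, key_bits, digit_bits=8):
    """Stable LSD radix sort of non-negative integers (illustrative helper)."""
    mask = (1 << digit_bits) - 1
    # Process digits from least significant to most significant.
    for shift in range(0, key_bits, digit_bits):
        buckets = [[] for _ in range(1 << digit_bits)]
        for k in keys:
            buckets[(k >> shift) & mask].append(k)
        # Concatenating buckets in order keeps the pass stable.
        keys = [k for b in buckets for k in b]
    return keys


def hybrid_radix_sort(keys, key_bits=32, msd_bits=4):
    """One MSD pass on the top msd_bits, then an LSD sort inside each bucket.

    Assumption: every key satisfies 0 <= key < 2**key_bits.
    """
    low_bits = key_bits - msd_bits
    buckets = [[] for _ in range(1 << msd_bits)]
    # MSD pass: partition keys by their most significant bits.
    for k in keys:
        buckets[k >> low_bits].append(k)
    out = []
    # Buckets are already ordered relative to each other, so sorting each
    # one on its remaining low bits and concatenating gives a sorted result.
    for b in buckets:
        out.extend(lsd_radix_sort(b, low_bits))
    return out
```

Calling `hybrid_radix_sort(data)` on a list of 32-bit unsigned values returns a new sorted list. The bucket count (`1 << msd_bits`) is the knob that would control how many independent chunks are available for overlapping transfer and computation.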
Year: 2012
DOI: 10.1109/CCGrid.2012.16
Venue: CCGrid
Keywords: gpu implementation, gpu performance enhancement, case studies, communication cost reduction, wsn relay node placement, overall performance, data transmission, radix sort, performance improvement, communication cost, major performance bottleneck, performance enhancement, 3-star algorithm, data compression, public domain, approximation algorithms, instruction sets, kernel, wireless sensor network, wireless sensor networks
Field: Bottleneck, Central processing unit, Data transmission, Computer science, Parallel computing, Radix sort, Graphics processing unit, Data compression, Cost reduction, Distributed computing, Performance improvement
DocType: Conference
Citations: 2
PageRank: 0.39
References: 17
Authors: 5
Name              Order  Citations  PageRank
Che-Rung Lee      1      78         13.52
Shih-hsiang Lo    2      37         6.15
Nan-Hsi Chen      3      2          0.39
Yeh-Ching Chung   4      983        97.16
I-hsin Chung      5      388        32.41