Title
A Compiler Translate Directive-Based Language to Optimized CUDA
Abstract
Graphics processing units (GPUs) provide a low-cost platform for accelerating high-performance computations. New programming languages, such as CUDA and OpenCL, make GPU programming attractive to programmers. However, programming GPUs is still cumbersome for two reasons: tedious performance optimization and a lack of portability. First, optimizing an algorithm for a specific GPU is a time-consuming task that requires a thorough understanding of both the algorithm and the underlying hardware. Unoptimized CUDA programs typically achieve only a small fraction of peak GPU performance. Second, CUDA programs lack performance portability between different GPUs. Moving code from one GPU to another while maintaining the desired performance is a non-trivial task that often requires significant time. In this paper, we propose an optimizing compiler that translates a representative high-level directive-based language to CUDA and performs a wide variety of optimizations to generate efficient GPU code. We alleviate the portability problem of current GPU programming methods by using a high-level directive-based language that provides a unified abstraction for currently popular CPU-GPU heterogeneous systems. Various optimizations, mainly memory system optimizations, are applied automatically by our compiler to produce optimized CUDA code. Experiments on the Rodinia benchmark with different input sizes show that our compiler achieves, on average, 70%, 75%, and 84% of the performance of hand-written code, respectively.
Year
2014
DOI
10.1109/HPCC.2014.162
Venue
HPCC/CSS/ICESS
Keywords
programming languages, cuda programs, optimized compiler, hand-written code, gpu, performance optimization, parallel programming, compiler, parallel architectures, graphics processing units, gpu programming, gpu programming methods, portability, parallel languages, performance portability problem, memory system optimizations, high performance computations, high level directive-based language, gpu performance, optimising compilers, directive-based language, opencl, cuda code, rodinia benchmark, hardware, instruction sets, optimization, kernel, parallel processing
Field
Graphics, Instruction set, Computer science, CUDA, Parallel computing, Directive, Compiler, Software portability, General-purpose computing on graphics processing units, CUDA Pinned memory
DocType
Conference
ISBN
978-1-4799-6122-1
Citations
0
PageRank
0.34
References
11
Authors
6
Name            Order   Citations   PageRank
Li Feng         1       1           2.38
Hong An         2       58          24.15
Weihao Liang    3       3           2.77
Xiaoqiang Li    4       8           1.18
Yichao Cheng    5       3           1.39
Xia Jiang       6       1           0.70