A Compiler Translate Directive-Based Language to Optimized CUDA - Citegraph

Paper Info

Title
A Compiler Translate Directive-Based Language to Optimized CUDA

Abstract
Graphics processing units (GPUs) provide a low-cost platform for accelerating high-performance computations. New programming languages, such as CUDA and OpenCL, make GPU programming attractive to programmers. However, programming GPUs is still a cumbersome task for two reasons: tedious performance optimizations and lack of portability. First, optimizing an algorithm for a specific GPU is a time-consuming task that requires a thorough understanding of both the algorithm and the underlying hardware. Unoptimized CUDA programs typically achieve only a small fraction of the peak GPU performance. Second, CUDA programs lack performance portability between different GPUs. Moving code from one GPU to another while maintaining the desired performance is a non-trivial task that often requires significant time. In this paper, we propose an optimized compiler that compiles a representative high-level directive-based language to CUDA and is capable of performing a wide variety of optimizations to generate efficient code for GPUs. We alleviate the portability problem of current GPU programming methods by using a high-level directive-based language that provides a unified abstraction for currently popular CPU-GPU heterogeneous systems. Various optimizations, mainly memory system optimizations, are automatically applied by our compiler to produce optimized CUDA code for GPUs. Experiments on the Rodinia benchmark with different input sizes show that our compiler achieves, on average, 70%, 75%, and 84% of the performance of hand-written code, respectively.

Year	DOI	Venue
2014	10.1109/HPCC.2014.162	HPCC/CSS/ICESS
Keywords	Field	DocType
programming languages,cuda programs,optimized compiler,hand-written code,gpu,performance optimization,parallel programming,compiler,parallel architectures,graphics processing units,gpu programming,gpu programming methods,portability,parallel languages,performance portability problem,memory system optimizations,high performance computations,high level directive-based language,gpu performance,optimising compilers,directive-based language,opencl,cuda code,rodinia benchmark,hardware,instruction sets,optimization,kernel,parallel processing	Graphics,Instruction set,Computer science,CUDA,Parallel computing,Directive,Compiler,Software portability,General-purpose computing on graphics processing units,CUDA Pinned memory	Conference
ISBN	Citations	PageRank
978-1-4799-6122-1	0	0.34
References	Authors
11	6

Authors (6 rows)

Name	Order	Citations	PageRank
Li Feng	1	1	2.38
Hong An	2	58	24.15
Weihao Liang	3	3	2.77
Xiaoqiang Li	4	8	1.18
Yichao Cheng	5	3	1.39
Xia Jiang	6	1	0.70