Title
Sponge: portable stream programming on graphics engines
Abstract
Graphics processing units (GPUs) provide a low cost platform for accelerating high performance computations. The introduction of new programming languages, such as CUDA and OpenCL, makes GPU programming attractive to a wide variety of programmers. However, programming GPUs is still a cumbersome task for two primary reasons: tedious performance optimizations and lack of portability. First, optimizing an algorithm for a specific GPU is a time-consuming task that requires a thorough understanding of both the algorithm and the underlying hardware. Unoptimized CUDA programs typically only achieve a small fraction of the peak GPU performance. Second, GPU code lacks efficient portability as code written for one GPU can be inefficient when executed on another. Moving code from one GPU to another while maintaining the desired performance is a non-trivial task often requiring significant modifications to account for the hardware differences. In this work, we propose Sponge, a compilation framework for GPUs using synchronous data flow streaming languages. Sponge is capable of performing a wide variety of optimizations to generate efficient code for graphics engines. Sponge alleviates the problems associated with current GPU programming methods by providing portability across different generations of GPUs and CPUs, and a better abstraction of the hardware details, such as the memory hierarchy and threading model. Using streaming, we provide a write-once software paradigm and rely on the compiler to automatically create optimized CUDA code for a wide variety of GPU targets. Sponge's compiler optimizations improve the performance of the baseline CUDA implementations by an average of 3.2x.
Year
2011
DOI
10.1145/1950365.1950409
Venue
ASPLOS
Keywords
peak gpu performance, wide variety, specific gpu, cuda code, graphics engine, gpu code, high performance computation, gpu target, gpu programming, portable stream programming, current gpu programming method, efficient code, compiler optimization, optimization, compiler, portability, programming language
Field
Programming language, Memory hierarchy, CUDA, Computer science, Parallel computing, Compiler, Real-time computing, Optimizing compiler, Software portability, General-purpose computing on graphics processing units, Synchronous Data Flow, CUDA Pinned memory
DocType
Conference
Volume
39
Issue
1
ISSN
0163-5964
Citations
58
PageRank
2.12
References
20
Authors
5
Name            Order  Citations  PageRank
Amir Hormati    1      418        19.11
Mehrzad Samadi  2      422        16.09
Mark Woh        3      432        28.18
Trevor Mudge    4      6139       659.74
Scott Mahlke    5      4811       312.08