Paper Info

Title
Sponge: portable stream programming on graphics engines

Abstract
Graphics processing units (GPUs) provide a low cost platform for accelerating high performance computations. The introduction of new programming languages, such as CUDA and OpenCL, makes GPU programming attractive to a wide variety of programmers. However, programming GPUs is still a cumbersome task for two primary reasons: tedious performance optimizations and lack of portability. First, optimizing an algorithm for a specific GPU is a time-consuming task that requires a thorough understanding of both the algorithm and the underlying hardware. Unoptimized CUDA programs typically only achieve a small fraction of the peak GPU performance. Second, GPU code lacks efficient portability as code written for one GPU can be inefficient when executed on another. Moving code from one GPU to another while maintaining the desired performance is a non-trivial task often requiring significant modifications to account for the hardware differences. In this work, we propose Sponge, a compilation framework for GPUs using synchronous data flow streaming languages. Sponge is capable of performing a wide variety of optimizations to generate efficient code for graphics engines. Sponge alleviates the problems associated with current GPU programming methods by providing portability across different generations of GPUs and CPUs, and a better abstraction of the hardware details, such as the memory hierarchy and threading model. Using streaming, we provide a write-once software paradigm and rely on the compiler to automatically create optimized CUDA code for a wide variety of GPU targets. Sponge's compiler optimizations improve the performance of the baseline CUDA implementations by an average of 3.2x.

Year	2011
DOI	10.1145/1950365.1950409
Venue	ASPLOS
Keywords	peak gpu performance,wide variety,specific gpu,cuda code,graphics engine,gpu code,high performance computation,gpu target,gpu programming,portable stream programming,current gpu programming method,efficient code,compiler optimization,optimization,compiler,portability,programming language
Field	Programming language,Memory hierarchy,CUDA,Computer science,Parallel computing,Compiler,Real-time computing,Optimizing compiler,Software portability,General-purpose computing on graphics processing units,Synchronous Data Flow,CUDA Pinned memory
DocType	Conference
Volume	39
Issue	1
ISSN	0163-5964
Citations	58
PageRank	2.12
References	20
Authors	5

Authors (5 rows)

Name	Order	Citations	PageRank
Amir Hormati	1	418	19.11
Mehrzad Samadi	2	422	16.09
Mark Woh	3	432	28.18
Trevor Mudge	4	6139	659.74
Scott Mahlke	5	4811	312.08