Title
Modeling GPU-CPU workloads and systems
Abstract
Heterogeneous systems, systems with multiple processors tailored for specialized tasks, are challenging programming environments. While it may be possible for domain experts to optimize a high performance application for a very specific and well documented system, it may not perform as well or even function on a different system. Developers who have less experience with either the application domain or the system architecture may devote a significant effort to writing a program that merely functions correctly. We believe that a comprehensive analysis and modeling framework is necessary to ease application development and automate program optimization on heterogeneous platforms. This paper reports on an empirical evaluation of 25 CUDA applications on four GPUs and three CPUs, leveraging the Ocelot dynamic compiler infrastructure which can execute and instrument the same CUDA applications on either target. Using a combination of instrumentation and statistical analysis, we record 37 different metrics for each application and use them to derive relationships between program behavior and performance on heterogeneous processors. These relationships are then fed into a modeling framework that attempts to predict the performance of similar classes of applications on different processors. Most significantly, this study identifies several non-intuitive relationships between program characteristics and demonstrates that it is possible to accurately model CUDA kernel performance using only metrics that are available before a kernel is executed.
Year
2010
DOI
10.1145/1735688.1735696
Venue
GPGPU
Keywords
different metrics, application domain, program behavior, cuda application, model cuda kernel performance, application development, different processor, modeling gpu-cpu workloads, automate program optimization, high performance application, program characteristic, system architecture, program optimization, gpgpu, dynamic compilation
Field
Program optimization, Kernel (linear algebra), CUDA, Program behavior, Computer science, Parallel computing, Compiler, Application domain, General-purpose computing on graphics processing units, Systems architecture, Distributed computing
DocType
Conference
Citations
44
PageRank
2.20
References
5
Authors
3
Name                      Order  Citations  PageRank
Andrew Kerr               1      122        5.46
Gregory Frederick Diamos  2      1117       51.07
Sudhakar Yalamanchili     3      1836       184.95