Abstract |
---|
CUDA and OpenCL offer two different interfaces for programming GPUs. OpenCL is an open standard that can be used to program CPUs, GPUs, and other devices from different vendors, while CUDA is specific to NVIDIA GPUs. Although OpenCL promises a portable language for GPU programming, its generality may entail a performance penalty. In this paper, we compare the performance of CUDA and OpenCL using complex, near-identical kernels. We show that when using NVIDIA compiler tools, converting a CUDA kernel to an OpenCL kernel involves minimal modifications. Making such a kernel compile with ATI's build tools involves more modifications. Our performance tests measure and compare data transfer times to and from the GPU, kernel execution times, and end-to-end application execution times for both CUDA and OpenCL. |
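
To give a rough sense of why converting a CUDA kernel to OpenCL can involve only minimal modifications, the sketch below rewrites a few well-known CUDA spellings into their OpenCL counterparts. The lookup table, the `cuda_to_opencl` helper, and the `scale` kernel are hypothetical illustrations and are not taken from the paper; real ports also need changes this sketch does not make, such as adding an address-space qualifier like `__global` to pointer parameters.

```python
import re

# Hypothetical lookup table of common CUDA spellings and their OpenCL
# counterparts. It is an illustration only and is far from exhaustive.
CUDA_TO_OPENCL = {
    "__global__": "__kernel",
    "__shared__": "__local",
    "__syncthreads()": "barrier(CLK_LOCAL_MEM_FENCE)",
    "threadIdx.x": "get_local_id(0)",
    "blockIdx.x": "get_group_id(0)",
    "blockDim.x": "get_local_size(0)",
    "gridDim.x": "get_num_groups(0)",
}


def cuda_to_opencl(source: str) -> str:
    """Replace the CUDA spellings listed above with OpenCL ones."""
    # Match longer keys first so no key is shadowed by a shorter one.
    keys = sorted(CUDA_TO_OPENCL, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: CUDA_TO_OPENCL[m.group(0)], source)


# Hypothetical toy kernel; the paper's kernels are far more complex.
cuda_kernel = """
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}
"""

opencl_kernel = cuda_to_opencl(cuda_kernel)
print(opencl_kernel)
```

The output still declares `float *data` without an address-space qualifier, so it would need a manual `__global` before an OpenCL compiler accepts it. That gap is the kind of small hand edit the abstract alludes to.
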
Year | Venue | Keywords |
---|---|---|
2010 | Computing Research Repository (CoRR) | cluster computing, GPU programming, open standard, data transfer |
Field | DocType | Volume |
---|---|---|
Kernel (linear algebra), Open standard, Data transmission, CUDA, Computer science, Parallel computing, Compiler, General-purpose computing on graphics processing units, Quantum Monte Carlo | Journal | abs/1005.2 |
Citations | PageRank | References |
---|---|---|
34 | 2.05 | 4 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kamran Karimi | 1 | 118 | 17.23 |
Neil Dickson | 2 | 63 | 6.72 |
Firas Hamze | 3 | 131 | 14.05 |