Title
The Impact of Address Arithmetic on the GPU Implementation of Fast Algorithms for the Vilenkin-Chrestenson Transform
Abstract
This paper considers how address arithmetic in the Cooley-Tukey and the constant geometry fast algorithms for the Vilenkin-Chrestenson transform affects their implementation on the graphics processing unit (GPU). We consider issues such as the use of different transform radices, and analyze the number of GPU instructions and the register usage in the OpenCL implementations of the considered algorithms. Further, we compare the program running times on the GPU and on the central processing unit (CPU). Experiments show that the GPU implementations are 10 to 22 times faster than the C/C++ CPU implementations, depending on the transform radix and the number of variables in the processed function. The OpenCL implementation of the constant geometry algorithm translates into fewer GPU arithmetic and fetch instructions and uses fewer registers. This implementation requires up to 21% shorter processing times than the corresponding Cooley-Tukey algorithm implementation.
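The difference in address arithmetic that the abstract refers to can be illustrated with a small sketch. The code below is not the paper's OpenCL code; it is a plain Python illustration under stated assumptions: the basic Chrestenson matrix is taken as entries w**(j*k) with w = exp(2*pi*i/p) (some texts use the complex conjugate), the function has n >= 1 variables of radix p, and all function names are hypothetical. In the Cooley-Tukey variant the operand stride changes in every step, while in the constant geometry variant every step uses the same read and write pattern.

```python
import cmath


def chrestenson_matrix(p):
    """p x p matrix with entries w**(j*k), w = exp(2*pi*i/p) (assumed sign convention)."""
    return [[cmath.exp(2j * cmath.pi * ((j * k) % p) / p) for k in range(p)]
            for j in range(p)]


def vct_cooley_tukey(f, p, n):
    """Cooley-Tukey style: the stride between butterfly operands is p**step."""
    c = chrestenson_matrix(p)
    data = list(f)
    for step in range(n):
        stride = p ** step
        for base in range(len(data)):
            # Visit each butterfly once: the radix-p digit at position `step` must be 0.
            if (base // stride) % p != 0:
                continue
            idx = [base + m * stride for m in range(p)]
            vals = [data[i] for i in idx]
            for k in range(p):
                data[idx[k]] = sum(c[k][m] * vals[m] for m in range(p))
    return data


def vct_constant_geometry(f, p, n):
    """Constant geometry: every step reads p neighbours and writes with one fixed pattern."""
    c = chrestenson_matrix(p)
    src = list(f)
    block = p ** (n - 1)  # assumes n >= 1
    for _ in range(n):
        dst = [0j] * len(src)
        for j in range(block):
            for k in range(p):
                # Same index expressions in every step; only src and dst swap roles.
                dst[k * block + j] = sum(c[k][m] * src[p * j + m] for m in range(p))
        src = dst
    return src


def vct_direct(f, p, n):
    """O(N^2) evaluation from the definition, used only to check the fast versions."""
    size = p ** n

    def digit_dot(a, b):
        # Sum of products of corresponding radix-p digits, modulo p.
        s = 0
        for _ in range(n):
            s += (a % p) * (b % p)
            a //= p
            b //= p
        return s % p

    return [sum(f[x] * cmath.exp(2j * cmath.pi * digit_dot(w, x) / p)
                for x in range(size))
            for w in range(size)]
```

Both fast versions perform n steps of p**(n-1) butterflies each. The constant geometry version trades in-place computation for a second buffer, and in exchange its index expressions do not depend on the step number, which is the kind of simplification in address arithmetic the abstract associates with fewer instructions and registers.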
Year: 2013
DOI: 10.1109/ISMVL.2013.59
Venue: ISMVL
Keywords: gpu instruction, address arithmetic, gpu implementation, cpu implementation, opencl implementation, fast algorithms, gpu arithmetic, lower number, corresponding cooley-tukey algorithm implementation, vilenkin-chrestenson transform, shorter processing time, central processing unit, gpu computing, parallel algorithms, cpu, computer architecture, instruction sets, kernel, geometry, matrix decomposition, public domain software, registers
Field: Kernel (linear algebra), Central processing unit, Parallel algorithm, Computer science, Instruction set, Parallel computing, Matrix decomposition, Algorithm, Arithmetic, Implementation, General-purpose computing on graphics processing units, Graphics processing unit
DocType: Conference
ISSN: 0195-623X
Citations: 1
PageRank: 0.41
References: 3
Authors: 2
Name                  Order  Citations  PageRank
Dusan B. Gajic        1      4          3.93
Radomir S. Stankovic  2      188        47.07