Base64 Encoding on Heterogeneous Computing Platforms - Citegraph

Paper Info

Title
Base64 Encoding on Heterogeneous Computing Platforms

Abstract
Base64 encoding has many applications on the Web. Previous studies investigated optimizations of the Base64 encoding algorithm on central processing units (CPUs). In this paper, we describe optimizations of the algorithm on heterogeneous computing platforms. More specifically, we explain the algorithm, convert it to kernels written in CUDA C/C++ and Open Computing Language (OpenCL), optimize the CUDA and OpenCL applications with streams, which can overlap data transfers with kernel computations, and vectorize the CUDA and OpenCL kernels to improve kernel throughput. We evaluate the impact of the number of streams on kernel performance on an NVIDIA Pascal P100 graphics processing unit (GPU) and a Nallatech 385A card that features an Intel Arria 10 GX1150 field-programmable gate array (FPGA). We also measure the performance and power of the applications on the CPU, GPU, and FPGA to understand the advantages of each platform and the benefit of kernel offloading. The experiments show that, on the GPU, using vector data types in the kernels does not improve performance, and more work-items are better than larger vectors per work-item. OpenCL and CUDA streams can achieve almost the same performance on the GPU, but streams should be used with caution when GPU resources are underutilized. On the FPGA, kernel vectorization using 16 vector lanes can achieve the highest performance when the number of streams is one. However, increasing the vector width per work-item and the number of streams can decrease the kernel computation time for each stream, and thereby reduce the number of concurrent operations across the streams. While the raw performance on the GPU is 3.1X higher than that on the FPGA, the FPGA consumes 3.4X less power. A comparison with a state-of-the-art implementation on an Intel CPU server shows an increasing benefit of kernel offloading.
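
As an illustration of why the algorithm maps naturally onto kernels, the following is a minimal C++ sketch of standard Base64 encoding (RFC 4648) written in a per-work-item style. It is not the paper's kernel. The names `encode_group` and `base64_encode` are hypothetical, and the serial host loop only stands in for a kernel launch in which each work-item would process one 3-byte group independently.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Standard Base64 alphabet (RFC 4648).
static const char kAlphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical per-work-item body: item i reads input bytes 3i..3i+2
// and writes output characters 4i..4i+3. No item depends on another.
void encode_group(const std::uint8_t* in, char* out, std::size_t i) {
    std::uint32_t v = (std::uint32_t(in[3 * i]) << 16) |
                      (std::uint32_t(in[3 * i + 1]) << 8) |
                      std::uint32_t(in[3 * i + 2]);
    out[4 * i]     = kAlphabet[(v >> 18) & 0x3F];
    out[4 * i + 1] = kAlphabet[(v >> 12) & 0x3F];
    out[4 * i + 2] = kAlphabet[(v >> 6) & 0x3F];
    out[4 * i + 3] = kAlphabet[v & 0x3F];
}

// Host-side stand-in for a kernel launch: a serial loop over the full
// 3-byte groups, followed by '=' padding for a 1- or 2-byte tail.
std::string base64_encode(const std::string& data) {
    const auto* in = reinterpret_cast<const std::uint8_t*>(data.data());
    const std::size_t n = data.size();
    const std::size_t full = n / 3;
    std::string out(4 * ((n + 2) / 3), '=');  // pre-filled with padding
    for (std::size_t i = 0; i < full; ++i) {
        encode_group(in, &out[0], i);
    }
    const std::size_t rem = n % 3;
    if (rem > 0) {
        std::uint32_t v = std::uint32_t(in[3 * full]) << 16;
        if (rem == 2) v |= std::uint32_t(in[3 * full + 1]) << 8;
        out[4 * full]     = kAlphabet[(v >> 18) & 0x3F];
        out[4 * full + 1] = kAlphabet[(v >> 12) & 0x3F];
        if (rem == 2) out[4 * full + 2] = kAlphabet[(v >> 6) & 0x3F];
    }
    return out;
}
```

Because each group touches disjoint input and output ranges, the loop body could in principle be assigned to separate work-items, and the input could be split into chunks handled by separate streams. How the paper actually partitions the work is not stated in the abstract.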

Year	DOI	Venue
2019	10.1109/ASAP.2019.00014	2019 IEEE 30th International Conference on Application-specific Systems, Architectures and Processors (ASAP)
Keywords	Field	DocType
Heterogeneous computing, GPU, FPGA, Base64 encoding, CUDA, OpenCL, Stream	Kernel (linear algebra), Central processing unit, CUDA, Computer science, Parallel computing, Field-programmable gate array, Vectorization (mathematics), Symmetric multiprocessor system, Image tracing, Graphics processing unit	Conference
Volume	ISSN	ISBN
2160-052X	2160-0511	978-1-7281-1602-0
Citations	PageRank	References
0	0.34	12
Authors
2

Authors (2 rows)

Name	Order	Citations	PageRank
Zheming Jin	1	17	11.95
Hal Finkel	2	6	3.21
