Implementing and Evaluating OpenCL on an ARMv8 Multi-Core CPU - Citegraph

Paper Info

Title
Implementing and Evaluating OpenCL on an ARMv8 Multi-Core CPU

Abstract
The OpenCL standard allows a wide variety of CPU, GPU, and accelerator architectures to be targeted through a single unified programming interface and language. Portability, however, relies heavily on platform-specific implementations. In this paper, we provide an OpenCL implementation for an ARMv8 multi-core CPU that efficiently maps the generic OpenCL platform model onto the ARMv8 multi-core architecture. With this implementation, we first characterize the maximum achieved arithmetic throughput and memory access bandwidth on the architecture, and measure the OpenCL-related overheads. Our results demonstrate that there is room for optimizing OpenCL kernel performance. We then compare the performance of OpenCL against serial and OpenMP code on 11 benchmarks. The experimental results show that (1) the OpenCL implementation can achieve an average speedup of 6x over its OpenMP counterpart, and (2) GPU-specific OpenCL code is often unsuitable for this ARMv8 multi-core CPU.

Year	DOI	Venue
2017	10.1109/ISPA/IUCC.2017.00131	2017 15th IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 16th IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC 2017)
Keywords	Field	DocType
OpenCL, FT-1500A, performance, programming	Kernel (linear algebra), Computer science, Parallel computing, Implementation, Bandwidth (signal processing), Human–computer interaction, Software portability, Throughput, Multi-core processor, Speedup	Conference
ISSN	Citations	PageRank
2158-9178	1	0.35
References	Authors
0	5

Authors (5 rows)

Name	Order	Citations	PageRank
Jianbin Fang	1	265	25.31
Peng Zhang	2	48	5.09
Tao Tang	3	42	7.44
Chun Huang	4	13	8.00
Canqun Yang	5	188	29.39
