Title
Automatic translation of data parallel programs for heterogeneous parallelism through OpenMP offloading
Abstract
Heterogeneous multicores like GPGPUs are now commonplace in modern computing systems. Although they offer the potential for high performance, programmers struggle to program such systems. This paper presents OAO, a compiler-based approach that automatically translates shared-memory OpenMP data-parallel programs to run on heterogeneous multicores through OpenMP offloading directives. Given the large user base of shared-memory OpenMP programs, our approach allows programmers to keep using a single-source programming model they are familiar with while benefiting from heterogeneous performance. OAO introduces a novel runtime optimization scheme that automatically eliminates unnecessary host–device communication, minimizing data-transfer overhead between the host and the accelerator device. We evaluate OAO by applying it to 23 benchmarks from the PolyBench and Rodinia suites on two distinct GPU platforms. Experimental results show that OAO achieves up to 32× speedup over the original OpenMP version and reduces host–device communication overhead by up to 99% compared with the hand-translated version.
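The abstract describes rewriting shared-memory OpenMP parallel loops into OpenMP offloading directives with explicit host–device data mapping. The sketch below is purely illustrative and is not OAO's actual output: it shows the general shape of such a rewrite on a hypothetical vector-addition kernel, with hand-written map clauses of the kind OAO's runtime scheme would manage automatically.

/* Illustrative only: a generic shared-memory-to-offloading rewrite,
 * not produced by OAO. The kernel and array size N are hypothetical. */
#include <stdio.h>

#define N 1024

int main(void) {
    static float a[N], b[N], c[N];
    for (int i = 0; i < N; i++) { a[i] = (float)i; b[i] = 2.0f * i; }

    /* Original shared-memory form:
     *   #pragma omp parallel for
     * Offloaded form: the loop runs on the device, and the map clauses
     * spell out the host-device transfers whose elimination OAO targets. */
    #pragma omp target teams distribute parallel for \
            map(to: a[0:N], b[0:N]) map(from: c[0:N])
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[N-1] = %f\n", c[N - 1]);
    return 0;
}

Compiled with an OpenMP 4.5+ compiler that supports target offloading (e.g. with GPU offloading enabled), the loop executes on the device; without device support it falls back to the host, so the single source remains valid in both settings.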
Year
2021
DOI
10.1007/s11227-020-03452-2
Venue
The Journal of Supercomputing
Keywords
Heterogeneous computing, Source-to-source translation, OpenMP offloading, Compilation optimization, GPUs
DocType
Journal
Volume
77
Issue
5
ISSN
0920-8542
Citations
0
PageRank
0.34
References
5
Authors
6
Name | Order | Citations | PageRank
Farui Wang | 1 | 1 | 0.70
Weizhe Zhang | 2 | 287 | 53.07
Haonan Guo | 3 | 0 | 0.34
Meng Hao | 4 | 6 | 3.17
Gangzhao Lu | 5 | 3 | 1.73
Zheng Wang | 6 | 79 | 10.37