Title
OpenACC vs the Native Programming on Sunway TaihuLight: A Case Study with GTC-P
Abstract
Sunway TaihuLight is a Chinese supercomputer that recently ranked first worldwide and was the first such system to be built entirely with home-grown processors. It can be programmed with two approaches: directive-based OpenACC and native programming. We study both approaches using GTC-P, a particle-in-cell code for investigating micro-turbulence in magnetic fusion plasmas, and compare the performance and programming effort of the OpenACC and native versions of GTC-P. The results show that in the OpenACC version, the kernel with irregular memory access becomes the main performance bottleneck because of poor data locality. To address this issue, we apply two optimizations to the native version: (1) register-level communication (RLC) and (2) an "asynchronization" strategy. With these two optimizations, the native version achieves up to 2.5X speedup for the memory-bound kernel compared with the OpenACC version. In addition, we scale GTC-P to 4,259,840 cores of TaihuLight and compare its performance with that of several world-leading supercomputers.
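The abstract attributes the OpenACC bottleneck to a kernel with irregular memory access and poor data locality. As a minimal sketch of what that can look like, the Python below uses a generic 1D particle-in-cell charge deposition. This is an assumption made for illustration, not GTC-P's actual kernel, and the names `deposit_charge` and `mean_index_jump`, the grid size, and the particle count are all hypothetical. It compares how far apart consecutive grid accesses land when particles arrive in random order versus sorted by position.

```python
import random


def deposit_charge(positions, num_cells):
    """Scatter unit charges onto a periodic 1D grid with linear weighting."""
    grid = [0.0] * num_cells
    for x in positions:
        cell = int(x) % num_cells        # left grid point of the particle
        frac = x - int(x)                # offset from that grid point
        grid[cell] += 1.0 - frac         # data-dependent (irregular) write
        grid[(cell + 1) % num_cells] += frac
    return grid


def mean_index_jump(positions, num_cells):
    """Average distance between grid cells touched by consecutive particles."""
    cells = [int(x) % num_cells for x in positions]
    jumps = [abs(b - a) for a, b in zip(cells, cells[1:])]
    return sum(jumps) / len(jumps)


# Hypothetical problem size, chosen only to keep the sketch fast.
random.seed(0)
NUM_CELLS = 1024
particles = [random.uniform(0.0, NUM_CELLS) for _ in range(10_000)]

# Same particles, two orderings: as generated, and sorted by position.
grid_unsorted = deposit_charge(particles, NUM_CELLS)
grid_sorted = deposit_charge(sorted(particles), NUM_CELLS)

unsorted_jump = mean_index_jump(particles, NUM_CELLS)
sorted_jump = mean_index_jump(sorted(particles), NUM_CELLS)

print("mean index jump, unsorted particles:", unsorted_jump)
print("mean index jump, sorted particles:  ", sorted_jump)
```

Both orderings deposit the same total charge, but the sorted run touches neighboring grid cells in sequence while the unsorted run jumps across the whole grid. On hardware with small on-chip memories, that second pattern is the kind of access that tends to be memory-bound.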
Year
2018
DOI
10.1109/CLUSTER.2018.00021
Venue
2018 IEEE International Conference on Cluster Computing (CLUSTER)
Keywords
Sunway TaihuLight, GTC-P, optimization, OpenACC
Field
Kernel (linear algebra), Bottleneck, Locality, Supercomputer, Computer science, Parallel computing, Bandwidth (signal processing), Sunway TaihuLight, Speedup
DocType
Conference
ISSN
1552-5244
ISBN
978-1-5386-8320-0
Citations
1
PageRank
0.38
References
13
Authors
7
Name, Order, Citations and PageRank (digits run together in the source)
Linjin Cai, 1, 20.74
Yichao Wang, 2, 21.10
William Tang, 3, 172.31
Bei Wang, 4, 52861.48
Stephane Ethier, 5, 29131.10
Zhao Liu, 6, 2510.73
James Lin, 7, 124.37