Abstract

GPUs have become the dominant computing platform for many applications, yet programming them with the widely used CUDA parallel programming model remains difficult. Since sequential C code is relatively easy to obtain, either from legacy repositories or by manual implementation, automatically translating C to its parallel CUDA counterpart is a promising way to relieve the burden of GPU programming. However, because of the huge differences between the sequential C and parallel CUDA programming models, existing approaches fail at this challenging auto-parallelized program translation. In this paper, we propose a learning-based framework, BabelTower, to address this problem. We first create a large-scale dataset consisting of compute-intensive, function-level monolingual corpora. We further propose using back-translation with a discriminative reranker to cope with unpaired corpora and parallel semantic conversion. Experimental results show that BabelTower outperforms the state of the art by 1.79, 6.09, and 9.39 points in terms of BLEU, CodeBLEU, and the specifically designed ParaBLEU metric, respectively. The CUDA code generated by BabelTower attains a speedup of up to 347x over the sequential C code, and developer productivity is improved by up to 3.8x.
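To make the translation task concrete, below is a minimal, hypothetical sketch (not taken from the paper) of the kind of conversion BabelTower targets: a sequential C loop and a hand-written CUDA counterpart. The function names, array sizes, and the use of unified memory are illustrative assumptions, not the paper's method or output.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sequential C version: element-wise vector addition.
void vec_add_c(const float *a, const float *b, float *c, int n) {
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

// Parallel CUDA counterpart: each thread computes one element,
// with a bounds check since the grid may overshoot n.
__global__ void vec_add_kernel(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    // Unified memory keeps this sketch short; a real translation
    // might instead emit explicit cudaMalloc/cudaMemcpy calls.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vec_add_kernel<<<blocks, threads>>>(a, b, c, n);
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);  // expected: 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}
```

Even in this simple case, the translator must map the loop index to a thread index, insert the bounds guard, pick a launch configuration, and manage device memory, which is why naive sequence-to-sequence translation without parallel-semantics-aware training tends to fail.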
Year | Venue | DocType
---|---|---
2022 | International Conference on Machine Learning | Conference

Citations | PageRank | References
---|---|---
0 | 0.34 | 0
Authors (13)

Name | Order | Citations | PageRank
---|---|---|---
Yuanbo Wen | 1 | 1 | 0.68 |
Qi Guo | 2 | 169 | 17.04 |
Qiang Fu | 3 | 0 | 0.34 |
Xiaqing Li | 4 | 0 | 0.34 |
Jianxing Xu | 5 | 0 | 0.68 |
Yanlin Tang | 6 | 0 | 0.34 |
Yongwei Zhao | 7 | 3 | 2.05 |
Xing Hu | 8 | 0 | 1.69 |
Zidong Du | 9 | 574 | 29.68 |
Ling Li | 10 | 0 | 0.34 |
Chao Wang | 11 | 372 | 62.24 |
Xuehai Zhou | 12 | 551 | 77.54 |
Yunji Chen | 13 | 0 | 0.34 |