Title
Accelerator-Centered Programming on Heterogeneous Systems
Abstract
Many-core parallel processors contribute to heterogeneous architectures and achieve high computational throughput. Connected to general-purpose CPUs via PCIe and working as coprocessors, these special-purpose cores usually serve as floating-point computing accelerators (ACCs). Popular programming models typically offload the compute-intensive parts to the accelerator and then aggregate the results, which results in a large amount of data transfer over PCIe. In this paper, we introduce an ACC-centered model that makes better use of the limited PCIe bandwidth, increases performance, and reduces the idle time of the ACC. To realize data-near-computing, our ACC-centered model aims to center the program on the ACC and offload the control-intensive parts to the CPU. Both the CPU and the ACC contribute to higher performance according to their architectural features. Validation on the Tianhe-2 supercomputer shows that our implementation of ACC-centered LU is competitive with the highly optimized Intel MKL hybrid implementation and achieves about a 5× speedup over the CPU version.
Year: 2016
DOI: 10.1109/PDCAT.2016.041
Venue: 2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
Keywords: Heterogeneous, Accelerator-centered, LU, MIC, Tianhe-2
Field: Tianhe-2, Central processing unit, Programming paradigm, Supercomputer, Computer science, Parallel computing, Real-time computing, PCI Express, Throughput, Coprocessor, Speedup, Distributed computing
DocType: Conference
ISBN: 978-1-5090-5082-6
Citations: 0
PageRank: 0.34
References: 12
Authors: 3
Name | Order | Citations | PageRank
Cheng Chen | 1 | 8 | 2.21
Yunfei Du | 2 | 72 | 14.62
Canqun Yang | 3 | 188 | 29.39