Title | ||
---|---|---|
Using Task-Based Parallelism Directly on the GPU for Automated Asynchronous Data Transfer |
Abstract | ||
---|---|---|
We present a framework, based on the QuickSched[1] library, that implements priority-aware task-based parallelism directly on CUDA GPUs. This allows large computations with complex data dependencies to be executed in a single GPU kernel call, removing the synchronization points that would otherwise be required between kernel calls. In this paradigm, data transfers to and from the GPU are modelled as load and unload tasks. These tasks are generated automatically and executed alongside the computational tasks, allowing fully asynchronous and concurrent data transfers. We implemented a tiled QR decomposition and a Barnes-Hut gravity calculation, both of which show significant improvement when using the task-based setup, effectively eliminating latencies due to data transfers between the GPU and the CPU. This shows that task-based parallelism is a valid alternative programming paradigm on GPUs, and can provide significant gains from both a data transfer and an ease-of-use perspective. |
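
The idea of modelling transfers as load and unload tasks can be illustrated with a minimal host-side sketch. It is not the QuickSched API and not the authors' GPU implementation: the names `build_graph` and `run_order`, the task dictionary layout, and the rule that a load task inherits the highest priority of its users are all assumptions made for illustration. The sketch runs sequentially in Python, whereas the framework described above executes tasks concurrently inside a single CUDA kernel.

```python
import heapq

def build_graph(compute_tasks):
    """Return {name: (priority, deps)} with generated load/unload tasks.

    compute_tasks maps a task name to a dict with optional keys
    "deps", "reads", "writes" and "priority" (hypothetical layout).
    """
    graph = {
        name: (t.get("priority", 0), set(t.get("deps", ())))
        for name, t in compute_tasks.items()
    }
    users, writers = {}, {}
    for name, t in compute_tasks.items():
        for res in list(t.get("reads", ())) + list(t.get("writes", ())):
            users.setdefault(res, set()).add(name)
        for res in t.get("writes", ()):
            writers.setdefault(res, set()).add(name)

    # One load task per resource; every task touching the resource waits on it.
    # Assumption: the load inherits the highest priority among its users.
    for res, names in users.items():
        load = f"load:{res}"
        graph[load] = (max(graph[n][0] for n in names), set())
        for n in names:
            graph[n][1].add(load)

    # One unload task per written resource; it waits on all of its writers.
    for res, names in writers.items():
        graph[f"unload:{res}"] = (0, set(names))
    return graph

def run_order(graph):
    """Priority-aware topological order: highest-priority ready task first."""
    pending = {n: len(deps) for n, (_, deps) in graph.items()}
    dependents = {n: [] for n in graph}
    for n, (_, deps) in graph.items():
        for d in deps:
            dependents[d].append(n)

    ready = [(-graph[n][0], n) for n, count in pending.items() if count == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, n = heapq.heappop(ready)
        order.append(n)  # a real scheduler would execute the task here
        for m in dependents[n]:
            pending[m] -= 1
            if pending[m] == 0:
                heapq.heappush(ready, (-graph[m][0], m))
    return order

# Two hypothetical compute tasks working on tiles "A" and "B".
tasks = {
    "factor": {"writes": ["A"], "priority": 2},
    "update": {"reads": ["A"], "writes": ["B"], "deps": ["factor"], "priority": 1},
}
order = run_order(build_graph(tasks))
print(order)
```

In this sketch the load of tile `B` is only needed by `update`, so it is scheduled after `factor` has already started on tile `A`. That ordering is the property the abstract describes: transfers are interleaved with computation rather than performed up front.
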
Year | DOI | Venue |
---|---|---|
2015 | 10.3233/978-1-61499-621-7-683 | Parallel Computing: On the Road to Exascale |
Keywords | Field | DocType |
Task-based parallelism, General-purpose GPU computing, Asynchronous data transfer | Instruction-level parallelism, Asynchronous communication, Computer architecture, Data transmission, Task parallelism, Computer science, Parallel computing, Theoretical computer science, Data parallelism | Conference |
Volume | ISSN | Citations |
27 | 0927-5452 | 0 |
PageRank | References | Authors |
0.34 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Aidan B. G. Chalk | 1 | 4 | 2.18 |
Pedro Gonnet | 2 | 89 | 13.43 |
Matthieu Schaller | 3 | 4 | 2.85 |