Abstract | ||
---|---|---|
We provide a case study of work-stealing, a popular method for run-time load balancing, on FPGAs. Following the Cederman-Tsigas implementation for GPUs, we synchronize work-items not with locks, mutexes or critical sections, but instead with the atomic operations provided by Altera's OpenCL SDK. We evaluate work-stealing for FPGAs by synthesizing a K-means clustering algorithm on an Altera P385 D5 board, both with work-stealing and with a statically-partitioned load. When block RAM utilization is maximised in both cases, we find that work-stealing leads to a 1.5x speedup. This demonstrates that the ability to do load balancing at run-time can outweigh the drawback of using 'expensive' atomics on FPGAs. We hope that our case study will stimulate further research into the high-level synthesis of fine-grained, lock-free, concurrent programs. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1145/2847263.2847343 | ACM/SIGDA International Symposium on Field-Programmable Gate Arrays |
Keywords | Field | DocType |
atomic operations, high-level synthesis, K-means clustering, load balancing, lock-free synchronization, parallelism | K-means clustering, Synchronization, Load balancing (computing), Computer science, Parallel computing, High-level synthesis, Field-programmable gate array, Real-time computing, Work stealing, Cluster analysis, Embedded system, Speedup | Conference |
Citations | PageRank | References |
15 | 0.69 | 13 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Nadesh Ramanathan | 1 | 19 | 3.84 |
John Wickerson | 2 | 35 | 7.17 |
Felix Winterstein | 3 | 94 | 8.00 |
George A. Constantinides | 4 | 1391 | 160.26 |