Abstract |
---|
This paper presents a dynamically heterogeneous architecture use-case that is both realistic and favorable for distributed work-stealing in regular parallel applications. Using a straightforward implementation of distributed dense matrix multiplication in X10's Global Load Balancing (GLB) library, we show that moderate differences in node processing power allow work-stealing to significantly outperform a standard static schedule such as SUMMA. The work-stealing implementation also scales comparably on up to 128 cores. |
Year | DOI | Venue |
---|---|---|
2016 | 10.1145/2931028.2931035 | X10@PLDI |
DocType | ISBN | Citations |
---|---|---|
Conference | 978-1-4503-4386-2 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 1 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Brendan Sheridan | 1 | 4 | 0.93 |
Jeremy T. Fineman | 2 | 587 | 36.10 |