Abstract |
---|
Driven by increasing specialization, multicore integration will soon enable large-scale chip multiprocessors (CMPs) with many processing cores. In order to take advantage of increasingly parallel hardware, independent tasks must be expressed at a fine level of granularity to maximize the available parallelism and thus potential speedup. However, the efficiency of this approach depends on the runtime system, which is responsible for managing and distributing the tasks. In this paper, we present a hierarchically distributed task pool for task parallel programming on Cell processors. By storing subsets of the task pool in the local memories of the Synergistic Processing Elements (SPEs), access latency and thus overheads are greatly reduced. Our experiments show that only a worker-centric runtime system that utilizes the SPEs for both task creation and execution is suitable for exploiting fine-grained parallelism. |
Year | DOI | Venue |
---|---|---|
2010 | 10.1007/978-3-642-15291-7_18 | Euro-Par (2) |
Keywords | Field | DocType |
cell processor,available parallelism,independent task,fine-grained parallelism,parallel hardware,task creation,worker-centric runtime system,task pool,runtime system,task parallel programming | Load balancing (computing),Task parallelism,Computer science,Parallel computing,Chip,Data parallelism,Granularity,Multi-core processor,Distributed computing,Speedup,Runtime system | Conference |
Volume | ISSN | ISBN |
6272 | 0302-9743 | 3-642-15290-2 |
Citations | PageRank | References |
0 | 0.34 | 8 |
Authors |
---|
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ralf Hoffmann | 1 | 0 | 0.34 |
Andreas Prell | 2 | 7 | 2.41 |
Thomas Rauber | 3 | 415 | 64.60 |