Paper Info

Title
A practical approach to DOACROSS parallelization

Abstract
Loops with cross-iteration dependences (doacross loops) often contain significant amounts of parallelism that can potentially be exploited on modern manycore processors. However, most production-strength compilers focus their automatic parallelization efforts on doall loops, and consider doacross parallelism to be impractical due to the space inefficiencies and the synchronization overheads of past approaches. This paper presents a novel and practical approach to automatically parallelizing doacross loops for execution on manycore-SMP systems. We introduce a compiler-and-runtime optimization called dependence folding that bounds the number of synchronization variables allocated per worker thread (processor core) to be at most the maximum depth of a loop nest being considered for automatic parallelization. Our approach has been implemented in a development version of the IBM XL Fortran V13.1 commercial parallelizing compiler and runtime system. For four benchmarks where automatic doall parallelization was largely ineffective (speedups of under 2×), our implementation delivered speedups of 6.5×, 9.0×, 17.3×, and 17.5× on a 32-core IBM Power7 SMP system, thereby showing that doacross parallelization can be a valuable technique to complement doall parallelization.
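To make the abstract's terminology concrete, the following is a minimal sketch of a doacross loop run with point-to-point synchronization. It is not the paper's dependence folding algorithm or the XL Fortran runtime. The function name `doacross_prefix`, the cyclic distribution of iterations, and the per-worker progress counter `done` are all assumptions chosen for illustration. The sketch keeps one synchronization variable per worker for a loop nest of depth one, which matches the bound stated in the abstract.

```python
import threading


def doacross_prefix(values, num_workers=4):
    """Compute out[i] = out[i-1] + values[i]**2 using a doacross schedule.

    Hypothetical illustration: each iteration has an independent part
    (the squaring) and a dependent part (reading out[i-1]).
    """
    n = len(values)
    out = [0] * n
    # One synchronization variable per worker: the index of the last
    # iteration that worker has completed (-1 means none yet).
    done = [-1] * num_workers
    cond = threading.Condition()

    def worker(w):
        # Cyclic distribution: worker w runs iterations w, w+T, w+2T, ...
        for i in range(w, n, num_workers):
            local = values[i] ** 2  # independent part, may overlap across workers
            if i > 0:
                owner = (i - 1) % num_workers  # worker that owns iteration i-1
                with cond:
                    # Wait until the owner has published iteration i-1.
                    cond.wait_for(lambda: done[owner] >= i - 1)
                local += out[i - 1]  # dependent part
            out[i] = local
            with cond:
                done[w] = i  # publish progress for this worker
                cond.notify_all()

    threads = [threading.Thread(target=worker, args=(w,)) for w in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out


if __name__ == "__main__":
    print(doacross_prefix([1, 2, 3, 4, 5]))
```

Because each worker executes its iterations in increasing order, a single monotonically increasing counter per worker is enough to answer "has iteration i-1 finished?", so no per-iteration flag is allocated. In CPython the global interpreter lock prevents real speedup here; the sketch shows only the synchronization pattern.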

Year	DOI	Venue
2012	10.1007/978-3-642-32820-6_23	Euro-Par

Field	DocType	Volume
Synchronization, IBM, Computer science, Parallel computing, Fortran, Thread (computing), Compiler, Multi-core processor, Automatic parallelization, Distributed computing, Runtime system	Conference	7484

ISSN	Citations	PageRank
0302-9743	4	0.42

References	Authors
11	6

Authors (6 rows)

Name	Order	Citations	PageRank
Priya Unnikrishnan	1	200	14.67
Jun Shirako	2	433	34.56
Kit Barton	3	4	3.80
Sanjay Chatterjee	4	61	4.41
Raul Silvera	5	179	10.74
Vivek Sarkar	6	4318	409.41
