Title
Exploiting Hierarchical Parallelism Using UPC
Abstract
High-Performance Computing (HPC) systems are increasingly moving towards deeply hierarchical architectures. However, the single-level parallel execution model embodied in legacy parallel programming models falls short in exploiting the multi-level parallelism opportunities in both hardware architectures and applications. This makes the use of richer execution models imperative in order to fully exploit hierarchical parallelism. Partitioned Global Address Space (PGAS) languages such as Unified Parallel C (UPC) are growing in popularity because of their ability to provide a globally shared address space with locality awareness. While UPC provides a welcome improvement over message passing libraries, users still program with a single level of parallelism in the context of SPMD. In this paper, we explore two explicit hierarchical programming approaches based on UPC to improve programmability and performance on hierarchical architectures. The first approach orchestrates computations on multiple sets of thread groups; the second extends UPC with nested, shared-memory multi-threading. This paper presents a detailed description of the proposed approaches and demonstrates their effectiveness in the context of the NAS Parallel Benchmarks and the Unbalanced Tree Search (UTS). Experimental results indicate that the hierarchical model not only provides greater expressive power but also enhances performance: all three benchmarks exceed the performance of the standard UPC implementations after being incrementally enhanced with hierarchical parallelism.
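The two-level idea in the abstract can be illustrated with a minimal sketch of hierarchical work decomposition. This is not UPC code and is not taken from the paper. It only shows the index arithmetic under the assumption that an outer level (standing in for thread groups or UPC threads) receives a contiguous block of the iteration space, and an inner level (standing in for nested shared-memory threads) subdivides that block. The names `block_split` and `hierarchical_split` are hypothetical.

```python
def block_split(lo, hi, parts, idx):
    """Return piece `idx` of the half-open range [lo, hi) cut into
    `parts` near-equal contiguous blocks.

    The first (length % parts) pieces receive one extra element.
    """
    if parts <= 0 or not 0 <= idx < parts:
        raise ValueError("need parts > 0 and 0 <= idx < parts")
    base, extra = divmod(hi - lo, parts)
    start = lo + idx * base + min(idx, extra)
    stop = start + base + (1 if idx < extra else 0)
    return start, stop


def hierarchical_split(n, groups, group_id, workers, worker_id):
    """Two-level decomposition of the range [0, n).

    Outer level (assumed): one contiguous block per group.
    Inner level (assumed): each group's block is split among its workers.
    """
    group_lo, group_hi = block_split(0, n, groups, group_id)
    return block_split(group_lo, group_hi, workers, worker_id)
```

For example, `hierarchical_split(10, 2, 1, 2, 0)` gives the share of the first worker in the second group. A real UPC program would express the outer level through data affinity and the inner level through whatever nested threading mechanism the implementation provides.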
Year: 2011
DOI: 10.1109/IPDPS.2011.273
Venue: IPDPS Workshops
Keywords: hierarchical architecture, nas parallel benchmarks, single-level parallelism, multi-level parallelism opportunity, approach orchestrates computation, unified parallel c, explicit hierarchical programming, hierarchical parallelism, exploiting hierarchical parallelism, hierarchical model, standard upc implementation, parallel programming, multi threading, hardware architecture, electronics packaging, parallel processing, instruction sets, software maintenance, programming, partitioned global address space, high performance computing
Field: Instruction-level parallelism, SPMD, Computer architecture, Implicit parallelism, Unified Parallel C, Task parallelism, Computer science, Parallel computing, Data parallelism, Execution model, Partitioned global address space
DocType: Conference
Citations: 1
PageRank: 0.36
References: 11
Authors (3)
Name                Order  Citations  PageRank
Lingyuan Wang       1      38         3.29
Saumil Merchant     2      17         2.64
Tarek El-Ghazawi    3      427        44.88