Abstract |
---|
Interconnects in emerging high-performance computing systems feature hardware support for one-sided, asynchronous communication and global address space programming models. This support is intended to improve parallel efficiency and productivity by allowing communication and computation to overlap and by permitting out-of-order delivery. In practice, though, complex interactions between the software stack and the communication hardware make it challenging to obtain optimum performance for a full application expressed with a one-sided programming paradigm. Here, we present a proof-of-concept study for an autotuning framework that instantiates hybrid kernels based on refactored codes using available communication libraries or languages on a Cray XE6 and an SGI Altix UV 1000. We validate our approach by improving performance for bandwidth- and latency-bound kernels of interest in quantum physics and astrophysics by up to 35% and 80%, respectively. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1145/2381056.2381075 | ACM SIGMETRICS Performance Evaluation Review |
Keywords | Field | DocType |
order delivery,communication method,sgi altix uv,communication hardware,one-sided programming paradigm,global address space programming,available communication library,hardware support,cray xe6,high performance computing system,optimum performance,asynchronous communication,rdma,pgas,mmps,programming model,out of order,proof of concept,code generation,programming paradigm,quantum physics | Asynchronous communication,Programming paradigm,Supercomputer,Computer science,Parallel computing,Code generation,Bandwidth (signal processing),Software,Remote direct memory access,Partitioned global address space,Distributed computing | Journal |
Volume | Issue | Citations |
40 | 2 | 1 |
PageRank | References | Authors |
0.39 | 11 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Adrian Tineo | 1 | 4 | 1.47 |
Sadaf R. Alam | 2 | 278 | 30.85 |
Thomas C. Schulthess | 3 | 106 | 15.16 |