Abstract | ||
---|---|---|
With the continual development of multi-core and many-core architectures, there is a constant need for architecture-specific tuning of application codes in order to realize high computational performance and energy efficiency, closer to the theoretical peaks of these architectures. In this paper, we present optimization and tuning of HipGISAXS, a parallel X-ray scattering simulation code [9], on various massively parallel state-of-the-art supercomputers based on multi-core and many-core processors. In particular, we target clusters of general-purpose multi-cores such as Intel Sandy Bridge and AMD Magny Cours, and many-core accelerators like Nvidia Kepler GPUs and Intel Xeon Phi coprocessors. We present both high-level algorithmic and low-level architecture-aware optimization and tuning methodologies on these platforms. We cover a detailed performance study of our codes on single and multiple nodes of several current top-ranking supercomputers. Additionally, we implement autotuning of many of the algorithmic and optimization parameters for dynamic selection of their optimal values to ensure high performance and high efficiency. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1007/978-3-319-10214-6_11 | High Performance Computing Systems: Performance Modeling, Benchmarking and Simulation |
Field | DocType | Volume |
Cluster (physics), Xeon Phi, Efficient energy use, Computer science, Parallel computing, Kernel fusion, Kepler, Coprocessor | Conference | 8551 |
ISSN | Citations | PageRank |
0302-9743 | 1 | 0.40 |
References | Authors | |
1 | 3 | |
Name | Order | Citations | PageRank |
---|---|---|---|
Abhinav Sarje | 1 | 35 | 5.71 |
Xiaoye S. Li | 2 | 1042 | 98.22 |
Alexander Hexemer | 3 | 5 | 2.71 |