Title | ||
---|---|---|
Overcoming extreme-scale reproducibility challenges through a unified, targeted, and multilevel toolset |
Abstract | ||
---|---|---|
Reproducibility, the ability to repeat program executions with the same numerical result or code behavior, is crucial for computational science and engineering applications. However, non-determinism in concurrency scheduling often hampers achieving this ability on high performance computing (HPC) systems. To aid in managing the adverse effects of non-determinism, prior work has provided techniques to achieve bit-precise reproducibility, but most of them focus only on small-scale parallelism. While scalable techniques recently emerged, they are disparate and target special purposes, e.g., single-schedule domains. On current systems with O(10^6) compute cores and future ones with O(10^9), any technique that does not embrace a unified, targeted, and multilevel approach will fall short of providing reproducibility. In this paper, we argue for a common toolset that embodies this approach, where programmers select and compose complementary tools and can effectively, yet scalably, analyze, control, and eliminate sources of non-determinism at scale. This allows users to gain reproducibility only to the levels demanded by specific code development needs. We present our research agenda and ongoing work toward this goal. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1145/2532352.2532357 | SE-HPCCSE@SC |
Field | DocType | Citations |
---|---|---|
Computational Science and Engineering, Reproducibility, Extreme scale, Software engineering, Supercomputer, Computer science, Concurrency, Scheduling (computing), Software requirements specification, Scalability | Conference | 4 |
PageRank | References | Authors |
---|---|---|
0.41 | 11 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Dong H. Ahn | 1 | 325 | 22.61 |
Gregory L. Lee | 2 | 199 | 14.30 |
Ganesh Gopalakrishnan | 3 | 1619 | 130.11 |
Zvonimir Rakamarić | 4 | 135 | 7.41 |
Martin Schulz | 5 | 2227 | 129.64 |
Ignacio Laguna | 6 | 239 | 24.56 |