Title
Towards Runtime Analytics in a Parallel Performance System
Abstract
Developers of scientific simulations use parallel performance systems to measure, analyze, and tune their applications on large-scale HPC machines. In the majority of these performance systems, the analysis takes place offline. More consequentially, if runtime analytics are desired, performance measurement infrastructures need to be designed and implemented in such a way as to make it possible. We investigate the question of how to create runtime analytics capabilities by considering this objective in a reference platform - the TAU Performance System. Our research work identifies general issues of concern and describes how these can be addressed in a new TAU-based analytics framework. Several case studies are proposed as different analytics examples. These are prototyped, evaluated on HPC machines, and discussed. The outcomes of the research study suggest that runtime analytics has merit. Furthermore, we believe the approach could directly carry forward to other parallel performance systems.
Year: 2019
DOI: 10.1109/HPCS48598.2019.9188097
Venue: 2019 International Conference on High Performance Computing & Simulation (HPCS)
Keywords: performance, analytics, scalable computing
DocType: Conference
ISBN: 978-1-7281-4485-6
Citations: 0
PageRank: 0.34
References: 9
Authors: 5
Name                Order   Citations   PageRank
Allen D. Malony     1       1787        190.85
Srinivasan Ramesh   2       0           0.34
Kevin A. Huck       3       119         14.53
Chad Wood           4       3           2.09
Sameer Shende       5       1351        116.40