Title
An Overhead Analysis of MPI Profiling and Tracing Tools
Abstract
MPI performance analysis tools are important instruments for finding performance bottlenecks in large-scale MPI applications. These tools commonly support either the profiling or the tracing of parallel applications. Depending on the type of analysis, the use of such a performance analysis tool may entail a significant runtime overhead on the monitored parallel application. However, overheads can occur in different stages of the performance analysis with varying severity, e.g., the overhead when initializing an MPI context is typically less problematic than when monitoring a high number of short-lived MPI function calls. In this work, we precisely define the different types of overheads that performance engineers may encounter when applying performance analysis tools. In the context of performance tuning, it is crucial to avoid delaying individual events (e.g., function calls) when monitoring MPI applications, as otherwise performance bottlenecks may not show up in the same spot as when running the applications without applying a performance analysis tool. We empirically examine the different types of overheads associated with popular performance analysis tools for a set of well-known proxy applications and categorize the tools according to our findings. Our study shows that although the investigated MPI profiling and tracing tools exhibit a rather unique overhead footprint, they hardly influence the net time of an MPI application, which is the time between the Init and Finalize calls. Performance engineers should be aware of all types of overheads associated with each tool to avoid very costly batch jobs.
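The distinction the abstract draws between overall overhead and overhead on the net time (the span between the Init and Finalize calls) can be illustrated with a small calculation. The following is a minimal sketch, not the authors' methodology: the `RunTiming` record, the function names, and all numbers are hypothetical and chosen only to show how a tool could add noticeable total overhead while barely changing the net time.

```python
from dataclasses import dataclass


@dataclass
class RunTiming:
    """Wall-clock timings of one run in seconds (hypothetical record layout)."""
    total: float  # process start to process exit
    net: float    # time between the Init and Finalize calls


def relative_overhead(baseline: float, monitored: float) -> float:
    """Relative slowdown of the monitored run with respect to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline time must be positive")
    return (monitored - baseline) / baseline


def overhead_breakdown(plain: RunTiming, tooled: RunTiming) -> dict:
    """Split tool overhead into net-time overhead and time added outside the net time."""
    outside_plain = plain.total - plain.net
    outside_tooled = tooled.total - tooled.net
    return {
        "total": relative_overhead(plain.total, tooled.total),
        "net": relative_overhead(plain.net, tooled.net),
        "outside_net_seconds": outside_tooled - outside_plain,
    }


# Made-up numbers for illustration only; they are not measurements from the paper.
plain = RunTiming(total=100.0, net=98.0)    # run without a tool
tooled = RunTiming(total=112.0, net=99.0)   # same run with a tool attached
breakdown = overhead_breakdown(plain, tooled)
print(breakdown)
```

With these invented inputs, almost all of the added time falls outside the net time, which is the kind of situation in which a batch job becomes more expensive even though the monitored events themselves are hardly delayed.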
Year
2022
DOI
10.1145/3526063.3535353
Venue
High Performance Distributed Computing
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
4
Name                    Order   Citations   PageRank
Sascha Hunold           1       12120.11
Jordy I. Ajanohoun      2       0           0.34
Ioannis Vardas          3       0           0.34
Jesper Larsson Träff    4       1           2.04