Title
Quiet Neighborhoods: Key to Protect Job Performance Predictability
Abstract
Interference from nearby jobs has recently been identified as the dominant reason for the high performance variability of parallel applications running on High Performance Computing (HPC) systems. Typically, HPC systems are dynamic, with multiple jobs arriving and leaving in an unpredictable fashion while simultaneously sharing the system interconnection network. In such an environment, contention for network resources causes random stalls in application execution, degrading both application and overall system performance. Eliminating job interactions in their neighbourhoods is key to guaranteeing the performance predictability of applications. In this paper, we propose the concept of quiet neighbourhoods, which significantly reduce job interactions. Quiet neighbourhoods are created by the system resource manager in two phases. First, multiple virtual network blocks are defined on top of the physical network resources based on typical workload distributions. Second, newly arriving jobs are allocated to these virtual blocks based on their size.
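The two phases described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `VirtualBlock`, `define_blocks`, and `allocate` are hypothetical, nodes are modeled as plain integers rather than a real network topology, and the sketch assumes one job per block with a smallest-fit placement rule, neither of which the abstract states.

```python
from dataclasses import dataclass


@dataclass
class VirtualBlock:
    """Hypothetical virtual network block: a fixed set of node ids."""
    nodes: list
    job_id: object = None  # None while the block is free


def define_blocks(total_nodes, block_sizes):
    """Phase 1 (sketch): carve the node range into blocks of the given sizes.

    block_sizes is assumed to be derived from a typical workload
    distribution, e.g. several small blocks and a few large ones.
    """
    if sum(block_sizes) > total_nodes:
        raise ValueError("block sizes exceed the number of nodes")
    blocks, start = [], 0
    for size in block_sizes:
        blocks.append(VirtualBlock(nodes=list(range(start, start + size))))
        start += size
    return blocks


def allocate(blocks, job_id, job_size):
    """Phase 2 (sketch): place the job in the smallest free block that fits.

    Assumption: one job per block, so jobs never share a block.
    Returns the chosen block, or None if the job has to wait.
    """
    candidates = [
        b for b in blocks if b.job_id is None and len(b.nodes) >= job_size
    ]
    if not candidates:
        return None
    best = min(candidates, key=lambda b: len(b.nodes))
    best.job_id = job_id
    return best
```

For example, with `blocks = define_blocks(16, [2, 2, 4, 8])`, a call to `allocate(blocks, "job-a", 3)` selects the 4-node block, so the job does not share that block with any other job under the one-job-per-block assumption.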
Year
2015
DOI
10.1109/IPDPS.2015.87
Venue
International Parallel & Distributed Processing Symposium
Keywords
Performance predictability, Performance reproducibility, Applications' interference, Network contention, Resource management, Virtual network topologies, InfiniBand
Field
QUIET, Resource management, Virtual network, Predictability, Supercomputer, InfiniBand, Workload, Computer science, Parallel computing, Computer network, Job performance, Distributed computing
DocType
Conference
ISSN
1530-2075
Citations
15
PageRank
0.63
References
15
Authors
6
Name                  Order  Citations  PageRank
Ana Jokanovic         1      45         4.95
José Carlos Sancho    2      382        29.97
Germán Rodríguez      3      116        10.04
Alejandro Lucero      4      15         0.63
Cyriel Minkenberg     5      399        39.21
Jesús Labarta         6      1862       165.09