Abstract |
---|
We present techniques for characterizing the bandwidth and congestion behavior of supercomputer High-Speed Networks (HSNs). By taking a link-level perspective, we gain generality over analyses that are tied to specific topologies. We illustrate these techniques using five months of a Blue Waters production dataset consisting of network utilization and congestion counters. We find that: (i) the execution time of communication-heavy applications is highly correlated with network stalls observed in the network topology, and the increase in application runtime can be as high as 1.7x with a nominal increase in stalls; (ii) heterogeneity in the available link bandwidth in the network can lead to backpressure and congestion even when the network is not underprovisioned; and (iii) links connected to I/O nodes are no more likely to observe congestion during operational hours than any other link in the system. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/CLUSTER.2018.00072 | 2018 IEEE International Conference on Cluster Computing (CLUSTER) |
Keywords | Field | DocType |
network congestion, congestion characterization, network congestion visualization | Supercomputer, Computer science, Link level, Network topology, Bandwidth (signal processing), Execution time, Network congestion, Benchmark (computing), Blue Waters, Distributed computing | Conference |
ISSN | ISBN | Citations |
1552-5244 | 978-1-5386-8320-0 | 1 |
PageRank | References | Authors |
0.36 | 8 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Saurabh Jha | 1 | 3 | 0.72 |
Jim M. Brandt | 2 | 70 | 10.20 |
Ann C. Gentile | 3 | 37 | 7.91 |
Zbigniew Kalbarczyk | 4 | 1896 | 159.48 |
Ravishankar K. Iyer | 5 | 3489 | 504.32 |