Abstract |
---|
The dragonfly topology is a popular choice for building high-radix, low-diameter, hierarchical networks with high-bandwidth links. On Cray installations of the dragonfly network, job placement policies and routing inefficiencies can lead to significant network congestion for a single job and multi-job workloads. In this paper, we explore the effects of job placement, parallel workloads and network configurations on network health to develop a better understanding of inter-job interference. We have developed a functional network simulator, Damselfly, to model the network behavior of Cray Cascade, and a visual analytics tool, DragonView, to analyze the simulation output. We simulate several parallel workloads based on five representative communication patterns on up to 131,072 cores. Our simulations and visualizations provide unique insight into the buildup of network congestion and present a trade-off between deployment dollar costs and performance of the network. |

Year | DOI | Venue |
---|---|---|
2016 | 10.1109/IPDPS.2016.123 | 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS) |

Keywords | Field | DocType |
---|---|---|
dragonfly network, congestion, inter-job interference, simulation, visual analytics | Software deployment, Computer science, Parallel computing, Computer network, Visual analytics, Network simulation, Network topology, Bandwidth (signal processing), Network congestion, Network behavior, Network traffic control, Distributed computing | Conference |

ISSN | ISBN | Citations |
---|---|---|
1530-2075 | 978-1-5090-2141-3 | 10 |

PageRank | References | Authors |
---|---|---|
0.53 | 9 | 5 |

Name | Order | Citations | PageRank |
---|---|---|---|
Abhinav Bhatele | 1 | 625 | 43.42 |
Nikhil Jain | 2 | 321 | 24.01 |
Yarden Livnat | 3 | 607 | 50.10 |
Valerio Pascucci | 4 | 3241 | 192.33 |
Peer-Timo Bremer | 5 | 1446 | 82.47 |