Title |
---|
Resiliency of HPC Interconnects: A Case Study of Interconnect Failures and Recovery in Blue Waters. |
Abstract |
---|
Availability of the interconnection network in high-performance computing (HPC) systems is fundamental to sustaining the continuous execution of applications at scale. When failures occur, interconnect recovery mechanisms orchestrate complex operations to recover network connectivity between the nodes. As the scale and design complexity of HPC systems increase, so does the system's susceptibility ... |
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/TDSC.2017.2737537 | IEEE Transactions on Dependable and Secure Computing |
Keywords | Field | DocType |
---|---|---|
Data security, Network security, Fault tolerance, Fault diagnosis, Multiprocessor interconnection, Data analysis | Psychological resilience, Network connectivity, Supercomputer, Computer science, Real-time computing, Fault tolerance, Interconnection, Multiprocessor interconnection, Blue Waters, Distributed computing | Journal |
Volume | Issue | ISSN |
---|---|---|
15 | 6 | 1545-5971 |
Citations | PageRank | References |
---|---|---|
2 | 0.36 | 0 |
Authors |
---|
7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Saurabh Jha | 1 | 3 | 0.72 |
Valerio Formicola | 2 | 60 | 7.90 |
Catello Di Martino | 3 | 219 | 14.78 |
Mark Dalton | 4 | 2 | 0.36 |
William T. C. Kramer | 5 | 156 | 11.36 |
Zbigniew Kalbarczyk | 6 | 1896 | 159.48 |
Ravishankar K. Iyer | 7 | 3489 | 504.32 |