Abstract | ||
---|---|---|
The interconnect network is the key component of a high-performance computing system. With the coming era of exascale (10<sup>18</sup> FLOPS) computing, the design of large-scale interconnect networks faces serious challenges in network management and fault tolerance. To construct higher-performance and more reliable interconnect networks, we propose an efficient and intelligent network management architecture for indirect interconnect networks. This paper introduces the network management architecture, the in-band management channels in the Network Interface Chips (NIC) and Network Routing Chips (NRC), an efficient centralized network management approach, and distributed intelligent fault-tolerant routing management. Using the prototype system, we evaluate the Control and Status Register (CSR) access latency of in-band network management. Using a customized simulation model for indirect interconnect networks, we also test typical distributed fault-tolerance scenarios. The experimental results show that in-band network management achieves, on average, a 716-fold improvement over out-of-band network management. Moreover, by running heuristic algorithms that automatically reconstruct routing tables around failed links or routers, the intelligent network management engine achieves fault-tolerant routing that maximizes network performance. |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICPADS47876.2019.00055 | 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS) |
Keywords | Field | DocType |
high performance computing system,interconnect network,network management,fault tolerate | Supercomputer,Computer science,Fault tolerance,Intelligent Network,Routing table,Interconnection,Network management,Network interface,Distributed computing,Network performance | Conference |
ISSN | ISBN | Citations |
1521-9097 | 978-1-7281-2584-8 | 0 |
PageRank | References | Authors |
0.34 | 5 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jijun Cao | 1 | 2 | 3.79 |
Ming-che Lai | 2 | 3 | 1.87 |
Zhang Luo | 3 | 0 | 0.34 |
Jiaqing Xu | 4 | 57 | 6.44 |
Zhengbin Pang | 5 | 49 | 11.18 |