Title
Scalable Collective Communication on the ASCI Q Machine
Abstract
Scientific codes spend a considerable part of their run time executing collective communication operations. Such operations can also be critical for efficient resource management in large-scale machines. Scalable collective communication is therefore a key factor in achieving good performance on large-scale parallel computers. In this paper we describe the performance and scalability of some common collective communication patterns on the ASCI Q machine. Experiments conducted on a 1024-node/4096-processor segment show that the network is fast and scalable. The network is able to barrier-synchronize in a few tens of µs, perform a broadcast with an aggregate bandwidth of more than 100 GB/s, and sustain heavy hot-spot traffic with limited performance degradation.
Year: 2003
DOI: 10.1109/CONECT.2003.1231478
Venue: HOT INTERCONNECTS 11
Keywords: hot spot, parallel computer, synchronisation, network topology
Field: Resource management, Broadcasting, Synchronization, Computer science, Parallel computing, Computer network, Collective communication, Network topology, Bandwidth (signal processing), Distributed computing, Scalability
DocType: Conference
Citations: 15
PageRank: 0.96
References: 9
Authors: 4
Name                 Order  Citations  PageRank
Fabrizio Petrini     1      2050       165.82
Juan Fernandez       2      269        23.17
Eitan Frachtenberg   3      1060       85.08
Salvador Coll        4      609        57.12