Title
Communication lower bounds for distributed-memory matrix multiplication
Abstract
We present lower bounds on the amount of communication that matrix multiplication algorithms must perform on a distributed-memory parallel computer. We denote the number of processors by P and the dimension of square matrices by n. We show that the most widely used class of algorithms, the so-called two-dimensional (2D) algorithms, are optimal, in the sense that in any algorithm that only uses O(n^2/P) words of memory per processor, at least one processor must send or receive @W(n^2/P^1^/^2) words. We also show that algorithms from another class, the so-called three-dimensional (3D) algorithms, are also optimal. These algorithms use replication to reduce communication. We show that in any algorithm that uses O(n^2/P^2^/^3) words of memory per processor, at least one processor must send or receive @W(n^2/P^2^/^3) words. Furthermore, we show a continuous tradeoff between the size of local memories and the amount of communication that must be performed. The 2D and 3D bounds are essentially instantiations of this tradeoff. We also show that if the input is distributed across the local memories of multiple nodes without replication, then @W(n^2) words must cross any bisection cut of the machine. All our bounds apply only to conventional @Q(n^3) algorithms. They do not apply to Strassen's algorithm or other o(n^3) algorithms.
Year
DOI
Venue
2004
10.1016/j.jpdc.2004.03.021
J. Parallel Distrib. Comput.
Keywords
Field
DocType
distributed memory,matrix multiplication algorithm,continuous tradeoff,matrix multiplication,distributed-memory matrix multiplication,communication,bisection cut,multiple node,lower bound,communication lower bound,distributed-memory parallel computer,lower bounds,local memory,algorithms use replication,square matrix,lower b ounds
Analysis of parallel algorithms,Matrix calculus,Upper and lower bounds,Computer science,Parallel computing,Square matrix,Distributed memory,Strassen algorithm,Local memories,Matrix multiplication,Distributed computing
Journal
Volume
Issue
ISSN
64
9
Journal of Parallel and Distributed Computing
Citations 
PageRank 
References 
104
4.41
26
Authors
3
Search Limit
100104
Name
Order
Citations
PageRank
Dror Irony11769.56
Sivan Toledo21995181.13
Alexander Tiskin322015.50