Title
Supporting Mixed-domain Mixed-precision Matrix Multiplication within the BLIS Framework
Abstract
We approach the problem of implementing mixed-datatype support within the general matrix multiplication (gemm) operation of the BLAS-like Library Instantiation Software (BLIS) framework, whereby each matrix operand A, B, and C may be stored as single- or double-precision real or complex values. Another factor of complexity, whereby the matrix product and accumulation are allowed to take place in a precision different from the storage precisions of either A or B, is also discussed. We first break the problem into orthogonal dimensions, considering the mixing of domains separately from mixing precisions. Support for all combinations of matrix operands stored in either the real or complex domain is mapped out by enumerating the cases and describing an implementation approach for each. Supporting all combinations of storage and computation precisions is handled by typecasting the matrices at key stages of the computation (during packing and/or accumulation, as needed). Several optional optimizations are also documented. Performance results gathered on a 56-core Marvell ThunderX2 and a 52-core Intel Xeon Platinum demonstrate that high performance is mostly preserved, with modest slowdowns incurred from unavoidable typecast instructions. The mixed-datatype implementation confirms that combinatorial intractability is avoided, with the framework relying on only two assembly microkernels to implement 128 datatype combinations.
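The two ideas in the abstract, casting at packing/accumulation time and handling a mixed-domain case with real arithmetic only, can be illustrated with a small sketch. This is not BLIS code or the BLIS API: the function names (`to_f32`, `pack`, `gemm_mixed_precision`, `gemm_real_c_complex_ab`) are hypothetical, single precision is emulated with the standard `struct` module, and the single domain case shown (real C, complex A and B) is only one of the enumerated combinations, handled here in a deliberately naive way.

```python
import struct


def to_f32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def pack(matrix, cast):
    """Copy a matrix while casting each element, mimicking a typecast during packing."""
    return [[cast(v) for v in row] for row in matrix]


def gemm_mixed_precision(a, b, c, comp_cast, store_cast):
    """Return C + A*B (hypothetical helper, not a BLIS routine).

    A and B are cast to the computation precision when packed; the
    product is accumulated in that precision and cast to C's storage
    precision only when the final value is written back.
    """
    ap = pack(a, comp_cast)
    bp = pack(b, comp_cast)
    m, k, n = len(a), len(b), len(b[0])
    out = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = comp_cast(0.0)
            for p in range(k):
                acc = comp_cast(acc + ap[i][p] * bp[p][j])
            row.append(store_cast(comp_cast(c[i][j]) + acc))
        out.append(row)
    return out


def gemm_real_c_complex_ab(a, b, c):
    """Return C + Re(A*B) for real C and complex A, B.

    Only the real part of the product can be stored, so the update
    uses Re(A)Re(B) - Im(A)Im(B) and never forms the imaginary part.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = [list(row) for row in c]
    for i in range(m):
        for j in range(n):
            for p in range(k):
                out[i][j] += (a[i][p].real * b[p][j].real
                              - a[i][p].imag * b[p][j].imag)
    return out


# Double-precision computation, single-precision storage of C.
c_f32 = gemm_mixed_precision([[0.1, 0.2]], [[0.3], [0.4]], [[1.0]],
                             comp_cast=float, store_cast=to_f32)

# Real C updated from complex A and B.
c_real = gemm_real_c_complex_ab([[1 + 2j]], [[3 + 4j]], [[10.0]])
```

Passing the casts as parameters keeps a single loop nest for every storage/computation pairing, which loosely mirrors the abstract's point that a small number of kernels can serve many datatype combinations.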
Year: 2021
DOI: 10.1145/3402225
Venue: ACM Transactions on Mathematical Software
Keywords: Dense, linear algebra, DLA, high-performance, real, complex, mixed, datatype, type, domain, precision, matrix, multiplication, microkernel, BLAS, BLIS, libraries, framework
DocType: Journal
Volume: 47
Issue: 2
ISSN: 0098-3500
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name                    Order  Citations  PageRank
Field G. Van Zee        1      312        23.19
Devangi N. Parikh       2      0          0.34
Robert A. van de Geijn  3      77         4.95