Title | ||
---|---|---|
PETRA: A 22nm 6.97TFLOPS/W AIB-Enabled Configurable Matrix and Convolution Accelerator Integrated with an Intel Stratix 10 FPGA |
Abstract | ||
---|---|---|
PETRA is a configurable FP16 matrix multiplication and convolution accelerator designed to be 2.5D integrated using Advanced Interface Bus (AIB). PETRA is built upon four 16×16 systolic arrays, but it employs a configurable H-tree accumulation to improve both the latency and the utilization by up to 8×. A 22nm 3.04mm² PETRA prototype provides 1.433TFLOPS in computing matrix-matrix multiplication (MMM) and convolution (conv) at 0.88V, and it achieves a 6.97TFLOPS/W peak efficiency at 0.7V. PETRA is integrated with an Intel Stratix 10 FPGA in a multi-chip package (MCP) to provide the flexibility of FPGA and the performance and efficiency of PETRA. |
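
The headline figures in the abstract can be related with a little arithmetic. The sketch below is a back-of-envelope check, not something taken from the paper. It assumes each processing element completes one FP16 multiply-accumulate per cycle, counted as 2 FLOPs, and that all four arrays are fully utilized. The function names are hypothetical.

```python
# Back-of-envelope check of the abstract's throughput and area figures.
# Values marked "from the abstract" are quoted; everything else is an assumption.

NUM_ARRAYS = 4        # from the abstract: four systolic arrays
ARRAY_DIM = 16        # from the abstract: each array is 16x16
FLOPS_PER_MAC = 2     # assumption: one multiply + one add per PE per cycle
PEAK_TFLOPS = 1.433   # from the abstract: throughput at 0.88 V
AREA_MM2 = 3.04       # from the abstract: prototype area in mm^2


def flops_per_cycle(num_arrays=NUM_ARRAYS, dim=ARRAY_DIM,
                    flops_per_mac=FLOPS_PER_MAC):
    """FLOPs per clock cycle if every PE is busy (assumed full utilization)."""
    return num_arrays * dim * dim * flops_per_mac


def implied_clock_mhz(peak_tflops=PEAK_TFLOPS):
    """Clock frequency (MHz) that would yield the reported peak throughput."""
    return peak_tflops * 1e12 / flops_per_cycle() / 1e6


def area_efficiency_tflops_per_mm2(peak_tflops=PEAK_TFLOPS, area_mm2=AREA_MM2):
    """Compute density from the two reported numbers."""
    return peak_tflops / area_mm2


if __name__ == "__main__":
    print("FLOPs per cycle:", flops_per_cycle())
    print("Implied clock (MHz):", round(implied_clock_mhz(), 1))
    print("TFLOPS per mm^2:", round(area_efficiency_tflops_per_mm2(), 3))
```

Under these assumptions, the 1.433 TFLOPS figure corresponds to a clock of roughly 700 MHz. The abstract does not state the operating frequency, so this should be read as an inference rather than a reported value.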
Year | DOI | Venue |
---|---|---|
2021 | 10.23919/VLSICircuits52068.2021.9492517 | 2021 Symposium on VLSI Circuits |
Keywords | DocType | ISSN |
multichip package,matrix-matrix multiplication,AIB,configurable FP16 matrix multiplication,PETRA prototype,configurable H-tree accumulation,16×16 systolic arrays,Advanced Interface Bus,convolution accelerator,Intel Stratix 10 FPGA,size 22.0 nm,voltage 0.7 V,voltage 0.88 V | Conference | 2158-5601 |
ISBN | Citations | PageRank |
978-1-6654-4766-9 | 0 | 0.34 |
References | Authors | |
0 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sung-Gun Cho | 1 | 2 | 2.06 |
Wei Tang | 2 | 1 | 0.69 |
Chester Liu | 3 | 7 | 0.78 |
Zhengya Zhang | 4 | 502 | 48.41 |