Title
Evaluating and Enhancing Intel® Stratix® 10 FPGAs for Persistent Real-Time AI.
Abstract
Interactive intelligent services (e.g., smart web search) are becoming essential datacenter workloads. They rely on data-intensive artificial intelligence (AI) algorithms that do not use batch computation due to their tight latency constraints. Since off-chip data accesses have higher latency and energy consumption than on-chip accesses, a persistent AI approach with the entire model stored in on-chip memory is becoming the new norm for real-time AI. This approach is the cornerstone of Microsoft's Brainwave FPGA-based AI cloud and was recently added to Nvidia's cuDNN library. In this work, we implement, optimize, and evaluate a Brainwave-like neural processing unit (NPU) on a large Stratix-10 FPGA. We benchmark it against a large Nvidia Volta GPU running cuDNN persistent AI kernels. Across real-time persistent RNN, GRU, and LSTM workloads, we show that Stratix-10 offers ~3× (FP32) and ~10× (INT8) better latency than the GPU (FP32), which uses only ~6% of its peak throughput. Then, we propose TensorRAM, an ASIC chiplet for persistent AI that is 2.5D integrated with an FPGA in the same package. TensorRAM enhances the on-chip memory capacity and bandwidth, with enough multi-precision INT8/4/2/1 throughput to match that bandwidth. Multiple TensorRAMs can be integrated with Stratix-10. Our evaluation shows that a small 32 mm² TensorRAM in 10 nm offers 64 MB of SRAM with 32 TB/s on-chiplet bandwidth and 64 TOP/s (INT8). A small Stratix-10 with a TensorRAM (INT8) offers 16× better latency and 34× better energy efficiency compared to the GPU (FP32). Overall, Stratix-10 with TensorRAM offers compelling and scalable persistent AI solutions.
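The statement that TensorRAM has enough throughput to match its bandwidth can be checked with simple arithmetic on the figures quoted in the abstract (64 MB, 32 TB/s, 64 TOP/s INT8). The following is a minimal sketch, not the authors' evaluation method. It assumes a batch-1 matrix-vector workload with no weight reuse, 1 byte per INT8 weight, 2 operations (multiply and add) per weight, 1 TB/s = 1e12 bytes/s, and 1 MB = 2**20 bytes. The function and variable names are hypothetical.

```python
# Back-of-envelope check of the TensorRAM figures quoted in the abstract.
# Assumptions (not from the source): batch size 1 with no weight reuse,
# 1 byte per INT8 weight, 2 operations (multiply + add) per weight,
# 1 TB/s = 1e12 bytes/s, and 1 MB = 2**20 bytes.

SRAM_BYTES = 64 * 2**20          # 64 MB of on-chiplet SRAM
BANDWIDTH_BYTES_PER_S = 32e12    # 32 TB/s on-chiplet bandwidth
PEAK_INT8_OPS_PER_S = 64e12      # 64 TOP/s (INT8)

BYTES_PER_WEIGHT = 1             # INT8 weight size (assumed)
OPS_PER_WEIGHT = 2               # one multiply-accumulate counted as 2 ops


def ops_needed_to_match_bandwidth(bandwidth, bytes_per_weight, ops_per_weight):
    """Compute rate needed to consume weights as fast as memory delivers them."""
    weights_per_s = bandwidth / bytes_per_weight
    return weights_per_s * ops_per_weight


def time_to_stream_model(model_bytes, bandwidth):
    """Seconds to read every stored byte once (one pass over the weights)."""
    return model_bytes / bandwidth


required_ops = ops_needed_to_match_bandwidth(
    BANDWIDTH_BYTES_PER_S, BYTES_PER_WEIGHT, OPS_PER_WEIGHT
)
stream_time_s = time_to_stream_model(SRAM_BYTES, BANDWIDTH_BYTES_PER_S)

print(f"Compute needed to keep up with memory: {required_ops / 1e12:.0f} TOP/s")
print(f"Quoted peak INT8 compute:              {PEAK_INT8_OPS_PER_S / 1e12:.0f} TOP/s")
print(f"Time to stream 64 MB once:             {stream_time_s * 1e6:.2f} microseconds")
```

Under these assumptions the required compute rate equals the quoted peak, which is consistent with a design balanced for batch-1 workloads where every weight is read once per inference step.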
Year: 2019
DOI: 10.1145/3289602.3293943
Venue: FPGA
DocType: Conference
ISBN: 978-1-4503-6137-8
Citations: 0
PageRank: 0.34
References: 0
Authors: 16
Name                Order  Citations  PageRank
Eriko Nurvitadhi    1      399        33.08
Dongup Kwon         2      25         4.92
Ali Jafari          3      43         7.04
Andrew Boutros      4      8          3.02
Jaewoong Sim        5      384        17.25
Phillip Tomson      6      6          0.94
Huseyin Sumbul      7      6          2.29
Gregory K. Chen     8      298        32.96
Phil V. Knag        9      0          0.34
Raghavan Kumar      10     73         12.56
Ram Krishnamurthy   11     650        74.63
Debbie Marr         12     175        12.39
Sergey Gribok       13     9          3.78
Bogdan Pasca        14     325        28.69
Martin Langhammer   15     104        20.22
Aravind Dasu        16     10         4.47