Title: A systematic study on benchmarking AI inference accelerators
Abstract
AI inference accelerators have drawn extensive attention, yet no previous work has performed a holistic and systematic benchmarking of them. First, an end-to-end AI inference pipeline consists of six stages spanning both the host and the accelerator, whereas previous work mainly evaluates hardware execution performance, which is only one stage on the accelerator. Second, there is no systematic evaluation of how different optimizations affect AI inference accelerators. Using six representative AI workloads and a typical AI inference accelerator, Diannao, based on the Cambricon ISA, we implement five frequently used AI inference optimizations as user-configurable hyper-parameters. We explore the optimization space by sweeping these hyper-parameters and quantifying each optimization's effect on the chosen metrics. We also provide cross-platform comparisons between Diannao and traditional platforms (Intel CPUs and Nvidia GPUs). Our evaluation yields several new observations and insights, which shed light on a comprehensive understanding of AI inference accelerators' performance and inform the co-design of upper-level optimizations and the underlying hardware architecture.
Year: 2022
DOI: 10.1007/s42514-022-00105-z
Venue: CCF Transactions on High Performance Computing
Keywords: AI accelerators, Inference, Performance evaluation, Optimization
DocType: Journal
Volume: 4
Issue: 2
ISSN: 2524-4922
Citations: 0
PageRank: 0.34
References: 2
Authors: 9
Name            Order  Citations  PageRank
Jiang Zihan     1      0          0.34
Li Jiansong     2      0          0.34
Liu Fangxin     3      0          0.34
Wanling Gao     4      299        19.12
Lei Wang        5      577        46.85
Lan Chuanxin    6      0          0.34
Tang Fei        7      0          0.34
Liu Lei         8      0          0.34
Li Tao          9      0          0.34