Title |
---|
System-Level Design and Integration of a Prototype AR/VR Hardware Featuring a Custom Low-Power DNN Accelerator Chip in 7nm Technology for Codec Avatars |
Abstract |
---|
Augmented Reality / Virtual Reality (AR/VR) devices aim to connect people in the Metaverse with photorealistic virtual avatars, referred to as “Codec Avatars”. Delivering high visual performance for Codec Avatar workloads, however, is a challenging task for mobile SoCs, given the tight power and form-factor constraints of AR/VR devices. On-device, local, near-sensor processing provides the best system-level energy efficiency and enables strong security and privacy features in the long run. In this work, we present a custom-built, prototype small-scale mobile SoC that achieves energy-efficient performance for running eye gaze extraction of the Codec Avatar model. The test-chip, fabricated in a 7nm technology node, features a Neural Network (NN) accelerator consisting of a 1024 Multiply-Accumulate (MAC) array, 2MB of on-chip SRAM, and a 32-bit RISC-V CPU. The featured test-chip is integrated on a prototype mobile VR headset to run the Codec Avatar application. This work aims to show the full-stack design considerations of system-level integration, hardware-aware model customization, and circuit-level acceleration to meet the challenging mobile AR/VR SoC specifications for a Codec Avatar demonstration. By re-architecting the Convolutional NN (CNN) based eye gaze extraction model and tailoring it for the hardware, the entire model fits on the chip, mitigating the system-level energy and latency cost of off-chip memory accesses. By efficiently accelerating the convolution operation at the circuit level, the presented prototype SoC achieves 30 frames-per-second performance with low power consumption in a small form factor. With the full-stack design considerations presented in this work, the featured test-chip consumes 22.7mW to run inference on the entire CNN model in 16.5ms from input to output for a single sensor image. As a result, the test-chip achieves 375 µJ/frame/eye energy efficiency within a 2.56 mm² silicon area. |
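The reported energy-per-frame figure follows directly from the power and latency numbers in the abstract (energy = power × time). A minimal sanity check of that arithmetic, using the record's values (variable names here are mine, not the paper's):

```python
# Verify the abstract's energy-per-frame claim: 22.7 mW over 16.5 ms per eye image.
power_w = 22.7e-3    # reported chip power: 22.7 mW
latency_s = 16.5e-3  # reported end-to-end inference latency: 16.5 ms

energy_uj = power_w * latency_s * 1e6  # joules -> microjoules
print(round(energy_uj))  # ~375, matching the reported 375 uJ/frame/eye
```

This confirms the three headline numbers (22.7 mW, 16.5 ms, 375 µJ/frame/eye) are mutually consistent.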
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/CICC53496.2022.9772810 | 2022 IEEE Custom Integrated Circuits Conference (CICC) |
Keywords | DocType | ISSN
---|---|---|
system-level design,custom low-power DNN accelerator chip,photorealistic virtual avatars,high visual performance,mobile SoCs,form factor constraints,near-sensor processing,system-level energy-efficiency,privacy features,prototype small-scale mobile SoC,energy-efficient performance,Codec Avatar model,technology node,Neural Network accelerator,on-chip SRAM,RISC-V CPU,prototype mobile VR headset,Codec Avatar application,system-level integration,hardware-aware model customization,circuit-level acceleration,Codec Avatar demonstration,Convolutional NN based eye gaze extraction model,off-chip memory accesses,low-power consumption,low form factors,full-stack design considerations,CNN model,prototype SoC,Multiply-Accumulate array,size 7.0 nm,memory size 2.0 MByte,word length 32.0 bit,power 22.7 mW,time 16.5 ms,energy 375.0 muJ | Conference | 0886-5930
ISBN | Citations | PageRank
---|---|---|
978-1-7281-8280-3 | 1 | 0.41
References | Authors
---|---|
5 | 14
Name | Order | Citations | PageRank |
---|---|---|---|
H. Ekin Sumbul | 1 | 1 | 0.41 |
Tony F. Wu | 2 | 1 | 0.75 |
Yuecheng Li | 3 | 1 | 0.41 |
Syed Shakib Sarwar | 4 | 1 | 0.41 |
William Koven | 5 | 1 | 0.41 |
Eli Murphy-Trotzky | 6 | 1 | 0.41 |
Xingxing Cai | 7 | 1 | 0.41 |
Elnaz Ansari | 8 | 9 | 1.39 |
Daniel H. Morris | 9 | 1 | 0.41 |
Huichu Liu | 10 | 1 | 0.41 |
Doyun Kim | 11 | 30 | 4.82 |
Edith Beigne | 12 | 536 | 52.54 |
Reality Labs | 13 | 1 | 0.41 |
Meta | 14 | 1 | 0.41 |