Title |
---|
Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks through Interleaved Bit-Partitioned Arithmetic |
Abstract |
---|
Albeit low-power, mixed-signal circuitry suffers from the significant overhead of Analog-to-Digital (A/D) conversion, a limited range for information encoding, and susceptibility to noise. This paper aims to address these challenges by offering and leveraging the following mathematical insight regarding vector dot-product, the basic operator in Deep Neural Networks (DNNs): this operator can be reformulated as a wide regrouping of spatially parallel low-bitwidth calculations that are interleaved across the bit partitions of multiple elements of the vectors. As such, the computational building block of our accelerator becomes a wide bit-interleaved analog vector unit comprising a collection of low-bitwidth multiply-accumulate modules that operate in the analog domain and share a single A/D converter (ADC). This bit-partitioning permits a lower-resolution ADC, while the wide regrouping alleviates the need for an A/D conversion per operation, amortizing its cost across multiple bit-partitions of the vector elements. Moreover, the low-bitwidth modules require a smaller encoding range and provide larger margins for noise mitigation. We also utilize a switched-capacitor design for our bit-level reformulation of DNN operations. The proposed switched-capacitor circuitry performs the regrouped multiplications in the charge domain and accumulates the results of each group in its capacitors over multiple cycles. This capacitive accumulation, combined with the wide bit-partitioned regrouping, reduces the rate of A/D conversions, further improving the overall efficiency of the design.
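For concreteness, here is a minimal NumPy sketch of the bit-partitioned, interleaved dot-product described above. The partition width `b`, the helper names, and the unsigned 8-bit operands are our illustrative assumptions, not details fixed by the paper:

```python
import numpy as np

def bit_partition(v, bits=8, b=2):
    """Split each element of an unsigned-integer vector into bits // b
    partitions of b bits each, least-significant partition first."""
    mask = (1 << b) - 1
    return [(v >> (j * b)) & mask for j in range(bits // b)]

def interleaved_dot(x, w, bits=8, b=2):
    """Dot product recomputed as a sum of wide, low-bitwidth MAC groups.

    Each inner np.dot multiplies one b-bit partition of every element of x
    with one b-bit partition of every element of w: the wide group that the
    accelerator evaluates in the analog domain behind a single shared ADC.
    The shift-and-add recombination across groups is plain digital logic.
    """
    xp = bit_partition(x, bits, b)
    wp = bit_partition(w, bits, b)
    total = 0
    for j, xj in enumerate(xp):
        for k, wk in enumerate(wp):
            group = int(np.dot(xj, wk))       # one wide low-bitwidth MAC group -> one conversion
            total += group << ((j + k) * b)   # digital shift-and-add recombination
    return total

# Sanity check against the full-precision dot product.
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=16, dtype=np.int64)
w = rng.integers(0, 256, size=16, dtype=np.int64)
assert interleaved_dot(x, w) == int(np.dot(x, w))
```

Each call to `np.dot` inside the loop stands in for one wide analog MAC group whose single accumulated result is digitized once, which is how the design amortizes the cost of A/D conversion across many bit-partitions of the vector elements.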
With this mathematical reformulation and its switched-capacitor implementation, we define one possible 3D-stacked microarchitecture, dubbed BiHiwe, that leverages clustering and hierarchical design to best exploit the power-efficiency of the mixed-signal domain and 3D stacking. We also build models for noise, computational non-idealities, and variations. For ten DNN benchmarks, BiHiwe delivers 5.5x speedup over Tetris, a leading purely-digital 3D-stacked accelerator, with less than 0.5% accuracy loss, achieved by careful treatment of noise, computation error, and various forms of variation. Compared to the RTX 2080 Ti (with tensor cores) and Titan Xp GPUs, both with 8-bit execution, BiHiwe offers 35.4x and 70.1x higher Performance-per-Watt, respectively. Relative to the mixed-signal RedEye, ISAAC, and PipeLayer, BiHiwe offers 5.5x, 3.6x, and 9.6x improvements in Performance-per-Watt, respectively. The results suggest that BiHiwe is an effective initial step on a path that combines mathematics, circuits, and architecture.
Year | DOI | Venue |
---|---|---|
2020 | 10.1145/3410463.3414634 | PACT '20: International Conference on Parallel Architectures and Compilation Techniques, Virtual Event, GA, USA, October 2020 |
DocType | Volume | ISBN |
---|---|---|
Conference | abs/1906.11915 | 978-1-4503-8075-1 |
Citations | PageRank | References |
---|---|---|
2 | 0.35 | 43 |
Authors |
---|
8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Soroush Ghodrati | 1 | 13 | 1.94 |
Hardik Sharma | 2 | 86 | 3.00 |
Sean Kinzer | 3 | 7 | 2.14 |
Amir Yazdanbakhsh | 4 | 241 | 15.28 |
Kambiz Samadi | 5 | 817 | 43.11 |
Nam Sung Kim | 6 | 3268 | 225.99 |
Doug Burger | 7 | 6160 | 491.08 |
H. Esmaeilzadeh | 8 | 1443 | 69.71 |