Title
DCT-RAM: A Driver-Free Process-In-Memory 8T SRAM Macro with Multi-Bit Charge-Domain Computation and Time-Domain Quantization
Abstract
Process-In-Memory (PIM) is a promising solution for alleviating the memory-wall bottleneck in memory-intensive applications like CNNs. Recent demonstrations of SRAM-based PIM designs, particularly those computing in the charge domain [1]–[5], have greatly improved the linearity of analog multiply-and-add (MAC) computations and quantization, as well as their robustness to process variations, bringing their inference accuracy close to that of digital hardware in practical computer vision benchmarks such as CIFAR-10. However, several limitations to large-scale integration of PIM macros remain, especially the assumed availability of powerful external reference voltage drivers and the lack of scaling-friendly designs. First, high-bandwidth analog buffers driving large output loads are necessary to distribute the massive number of analog signals (e.g. DAC outputs) across the macro without sacrificing signal fidelity and computing speed. For example, [10] reports that its DAC drivers occupy 11.4% of the macro area and incur a 94-pJ energy overhead in 28 nm, accounting for 68.5% of the total energy in a macro supporting 5b activations and 8b weights. Second, SAR ADCs are popular for the common 5–9 bit resolution range. Conventional SAR ADCs require high-speed, power-hungry analog buffers to drive the capacitive DACs (CDACs) to the reference voltages with short settling time and high accuracy. Given the hundreds of ADCs in each macro, the design complexity and overheads incurred by these drivers are dominant. Our simulated reference driver takes 2.9-pJ energy in 65 nm, which is comparable to the energy of an ADC (e.g. 3.56 pJ in [12]). Third, it is challenging to fit any conventional ≥7b SAR ADC into the narrow width of SRAM cells due to the bulky CDACs and layout matching requirements, ultimately limiting the computing parallelism and energy amortization.
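To put the quoted driver overheads in perspective, the short Python sketch below works through the arithmetic they imply. The four constants are the figures stated in the abstract; the function names and the derived quantities are illustrative assumptions, not results reported by the authors. In particular, the abstract does not state the operation over which each energy is measured, and the 2.9-pJ and 3.56-pJ figures come from different designs, so the ratio is only a rough indication.

```python
# Figures quoted in the abstract; everything derived from them is a
# back-of-envelope estimate, not a number reported by the authors.
DAC_DRIVER_ENERGY_PJ = 94.0   # DAC driver energy in [10], 28 nm
DAC_DRIVER_SHARE = 0.685      # fraction of total macro energy in [10]
REF_DRIVER_ENERGY_PJ = 2.9    # simulated SAR reference driver, 65 nm
ADC_ENERGY_PJ = 3.56          # ADC energy cited from [12]


def implied_total_energy_pj(part_pj: float, share: float) -> float:
    """Total energy implied by one component's energy and its share of the total."""
    if not 0.0 < share <= 1.0:
        raise ValueError("share must be in (0, 1]")
    return part_pj / share


def energy_ratio(driver_pj: float, adc_pj: float) -> float:
    """Driver energy expressed as a fraction of the ADC energy."""
    return driver_pj / adc_pj


total_pj = implied_total_energy_pj(DAC_DRIVER_ENERGY_PJ, DAC_DRIVER_SHARE)
non_driver_pj = total_pj - DAC_DRIVER_ENERGY_PJ
ratio = energy_ratio(REF_DRIVER_ENERGY_PJ, ADC_ENERGY_PJ)

print(f"Implied total energy in [10]: {total_pj:.1f} pJ")
print(f"Energy left for everything except DAC drivers: {non_driver_pj:.1f} pJ")
print(f"Reference driver energy relative to ADC energy: {ratio:.2f}")
```

Under these assumptions, the DAC drivers in [10] consume roughly twice the energy of the rest of the macro combined, and a single reference driver costs about four fifths of the energy of the ADC it serves, which is the overhead a driver-free design aims to remove.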
Year
2022
DOI
10.1109/CICC53496.2022.9772826
Venue
2022 IEEE Custom Integrated Circuits Conference (CICC)
Keywords
layout matching requirements, bulky CDACs, CDACs, capacitive DACs, digital hardware, process variations, MAC, analog multiply-and-add computations, CNNs, external reference voltage drivers, energy amortization, SRAM cells, simulated reference driver, design complexity, reference voltages, conventional SAR ADCs, high-speed power-hungry analog buffers, analog signals, high-bandwidth analog buffers, PIM macros, scale integration, CIFAR-10, practical computer vision benchmarks, inference accuracy approach, charge domain, SRAM-based PIM designs, memory-intensive applications, memory-wall bottleneck, time-domain quantization, multibit charge-domain computation, Driver-Free Process-In-Memory 8T SRAM Macro, DCT-RAM, word length 5.0 bit to 9.0 bit, size 28.0 nm, size 65.0 nm, energy 3.56 pJ, energy 94 pJ, energy 2.9 pJ
DocType
Conference
ISSN
0886-5930
ISBN
978-1-7281-8280-3
Citations
0
PageRank
0.34
References
7
Authors
5
Name, Order, Citations, PageRank
Zhiyu Chen, 1, 0, 0.34
Qing Jin, 2, 2, 2.11
Zhanghao Yu, 3, 0, 0.34
Yanzhi Wang, 4, 1082, 136.11
Kuiyuan Yang, 5, 148, 20.89