High-Speed Power-Efficient Coarse-Grained Convolver Architecture using Depth-First Compression Scheme - Citegraph

Paper Info

Title
High-Speed Power-Efficient Coarse-Grained Convolver Architecture using Depth-First Compression Scheme

Abstract
Convolutional neural networks (CNNs) play an important role in various applications, e.g., computer vision. Since CNN computations require numerous multiply-accumulate (MAC) operations, performing them efficiently is a crucial issue for CNN hardware accelerators. In this paper, we propose a high-speed power-efficient convolver architecture for CNN acceleration. A 3×3 convolver must produce one output every cycle, which is commonly accomplished by summing the results of nine parallel multiplications and requires ten carry-propagation adders (CPAs) in total. In contrast, the proposed coarse-grained convolver breaks the boundary between multipliers and reduces all partial products globally. Consequently, it requires only one CPA to generate the final result. It also features a globally delay-optimized partial product reduction tree and a depth-first compression scheme for both area and power minimization. The proposed convolver has been implemented using TSMC 40nm technology. Compared with a conventional 3×3 convolver baseline design, our design reduces area and power by 15.8% and 26.5%, respectively, at a clock rate of 1 GHz.
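
The idea of reducing all partial products globally with a single final CPA can be illustrated with a small behavioral sketch in Python. This is not the paper's implementation: it assumes unsigned 8-bit operands, uses a plain layer-by-layer 3:2 carry-save reduction in place of the delay-optimized tree and depth-first compression scheme, and the function names (`partial_products`, `csa`, `reduce_to_two`, `convolve_global`) are hypothetical.

```python
def partial_products(a, b, width=8):
    """Unsigned partial products of a*b: one shifted copy of a per set bit of b.

    Assumption: b fits in `width` bits and both operands are non-negative.
    """
    return [a << i for i in range(width) if (b >> i) & 1]


def csa(x, y, z):
    """3:2 carry-save compressor: x + y + z == s + c, with no carry propagation."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def reduce_to_two(rows):
    """Compress any number of rows down to two using 3:2 compressors only."""
    rows = list(rows)
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3   # rows that form complete groups of three
        nxt = []
        for i in range(0, full, 3):
            nxt.extend(csa(rows[i], rows[i + 1], rows[i + 2]))
        nxt.extend(rows[full:])            # leftover rows pass through unchanged
        rows = nxt
    return rows + [0] * (2 - len(rows))    # pad when fewer than two rows remain


def convolve_global(pixels, weights, width=8):
    """3x3 convolution output from one shared reduction over all nine products."""
    rows = [pp
            for p, w in zip(pixels, weights)
            for pp in partial_products(p, w, width)]
    s, c = reduce_to_two(rows)
    return s + c                           # the only carry-propagating addition


if __name__ == "__main__":
    pixels = [12, 200, 7, 0, 255, 33, 91, 128, 64]
    weights = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    reference = sum(p * w for p, w in zip(pixels, weights))
    print(convolve_global(pixels, weights) == reference)
```

In this model the partial products of all nine multiplications enter one shared pool, so carries are propagated only once, in the final `s + c`. A conventional design would instead finish each product with its own adder before summing the results.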

Year	2020
DOI	10.1109/ISCAS45731.2020.9180406
Venue	2020 IEEE International Symposium on Circuits and Systems (ISCAS)
Keywords	Convolvers, Delays, Computer architecture, Pipelines, Minimization, Hardware
DocType	Conference
ISBN	978-1-7281-3320-1
Citations	0
PageRank	0.34
References	0
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Yi-Lin Wu	1	0	0.68
Yi Lu	2	0	0.68
Juinn-Dar Huang	3	270	27.42
