Title
High-Speed Power-Efficient Coarse-Grained Convolver Architecture using Depth-First Compression Scheme
Abstract
Convolutional neural networks (CNNs) play an important role in many applications, e.g., computer vision. Because CNN computations require numerous multiply-accumulate (MAC) operations, performing them efficiently is a crucial issue for CNN hardware accelerators. In this paper, we propose a high-speed, power-efficient convolver architecture for CNN acceleration. A 3×3 convolver must produce an output every cycle; this is commonly accomplished by summing the results of nine parallel multiplications, which requires ten carry-propagation adders (CPAs) in total. The proposed coarse-grained convolver instead breaks the boundary between multipliers and reduces all partial products globally. Consequently, it requires only one CPA to generate the final result. It also features a globally delay-optimized partial product reduction tree and a depth-first compression scheme for both area and power minimization. The proposed convolver has been implemented in TSMC 40 nm technology. Compared with a conventional 3×3 convolver baseline, our design reduces area and power by 15.8% and 26.5%, respectively, at a clock rate of 1 GHz.
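The idea of pooling the partial products of all nine multiplications and finishing with a single carry-propagating addition can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' design: the function names are hypothetical, operands are assumed to be unsigned 8-bit values, and the reduction uses a plain layer-by-layer 3:2 carry-save compression rather than the delay-optimized tree or the depth-first compression scheme, whose details the abstract does not give.

```python
def partial_products(a, b, width=8):
    """Shifted partial products of an unsigned multiply a * b (assumed width-bit b)."""
    return [a << i for i in range(width) if (b >> i) & 1]


def compress_3_2(x, y, z):
    """Carry-save (3:2) compressor: x + y + z == s + c, with no carry propagation."""
    s = x ^ y ^ z                                # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1       # carries, shifted into place
    return s, c


def conv3x3_single_cpa(window, kernel, width=8):
    """Hypothetical model: nine products reduced together, one final addition."""
    # Pool the partial products of all nine multiplications into one set of rows.
    rows = []
    for a, b in zip(window, kernel):
        rows.extend(partial_products(a, b, width))

    # Reduce the rows with 3:2 compressors until at most two remain.
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3
        nxt = []
        for i in range(0, full, 3):
            nxt.extend(compress_3_2(rows[i], rows[i + 1], rows[i + 2]))
        nxt.extend(rows[full:])                  # leftover rows pass through
        rows = nxt

    # The only carry-propagating addition in the model (the single CPA).
    return sum(rows)


def conv3x3_reference(window, kernel):
    """Conventional formulation: nine separate products, then a sum."""
    return sum(a * b for a, b in zip(window, kernel))
```

For any nine-element `window` and `kernel` of unsigned values below 2^8, `conv3x3_single_cpa(window, kernel)` should agree with `conv3x3_reference(window, kernel)`. The model captures only the arithmetic equivalence; it says nothing about the delay, area, or power of a hardware implementation.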
Year
2020
DOI
10.1109/ISCAS45731.2020.9180406
Venue
2020 IEEE International Symposium on Circuits and Systems (ISCAS)
Keywords
Convolvers, Delays, Computer architecture, Pipelines, Minimization, Hardware
DocType
Conference
ISBN
978-1-7281-3320-1
Citations
0
PageRank
0.34
References
0
Authors (3)
Name | Order | Citations | PageRank
Yi-Lin Wu | 1 | 0 | 0.68
Yi Lu | 2 | 0 | 0.68
Juinn-Dar Huang | 3 | 270 | 27.42