| Abstract |
|---|
| In this invited paper, we present deep neural network (DNN) training accelerator designs in both ASIC and FPGA. The accelerator implements a stochastic gradient descent based training algorithm in 16-bit fixed-point precision. A new cyclic weight storage and access scheme enables using the same off-the-shelf SRAMs for non-transpose and transpose operations during the feed-forward and feed-backward phases, respectively, of the DNN training process. Incorporating the cyclic weight scheme, the overall DNN training processor is implemented in both 65 nm CMOS ASIC and Intel Stratix-10 FPGA hardware. We collectively report the ASIC and FPGA training accelerator results. |
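The abstract does not detail how cyclic weight storage provides both row (non-transpose) and column (transpose) access from the same SRAM banks. Below is a minimal NumPy sketch of one common diagonal/cyclic banking idea, assumed for illustration only (the function names and the exact rotation scheme are not from the paper): each row of the weight matrix is stored cyclically rotated by its row index, so a full row and a full column can each be read with exactly one element per bank.

```python
import numpy as np

def cyclic_store(W):
    # Store row i rotated left by i: bank b at address i holds W[i, (b + i) % N].
    N = W.shape[0]
    return np.array([np.roll(W[i], -i) for i in range(N)])

def read_row(mem, i):
    # Feed-forward (non-transpose) access: undo the rotation to recover row i.
    return np.roll(mem[i], i)

def read_col(mem, j):
    # Feed-backward (transpose) access: column j lies on a diagonal of the
    # banked storage, one element per bank: bank (j - i) % N at address i.
    N = mem.shape[0]
    return np.array([mem[i, (j - i) % N] for i in range(N)])
```

With this layout, both access patterns touch each bank exactly once per read, which is what lets the same single-port SRAMs serve feed-forward and feed-backward phases without a dedicated transpose buffer.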
| Year | DOI | Venue |
|---|---|---|
| 2020 | 10.1109/ISOCC50952.2020.9333063 | 2020 International SoC Design Conference (ISOCC) |

| Keywords | DocType | ISSN |
|---|---|---|
| on-device training, convolutional neural networks, hardware accelerator, energy efficiency | Conference | 2163-9612 |

| ISBN | Citations | PageRank |
|---|---|---|
| 978-1-7281-8332-9 | 1 | 0.35 |

| References | Authors |
|---|---|
| 0 | 4 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Shreyas K. Venkataramanaiah | 1 | 3 | 1.39 |
| Shihui Yin | 2 | 71 | 10.03 |
| Yu Cao | 3 | 329 | 29.78 |
| Jae-sun Seo | 4 | 536 | 56.32 |