Abstract |
---|
Convolutional neural networks (CNNs) achieve high recognition rates in image recognition and are used in embedded systems such as smartphones, robots, and self-driving cars. Low-end FPGAs are candidates for embedded image recognition platforms because they achieve real-time performance at low cost. However, CNNs have a large number of parameters, called weights, and internal data, called feature maps, which challenge FPGAs in both performance and memory capacity. To solve these problems, we exploit a split-CNN and weight sparseness. The split-CNN reduces the memory footprint by splitting the feature map into smaller patches, which allows the feature map to be stored in the FPGA's high-throughput on-chip memory. Weight sparseness reduces computational cost and yields even higher performance. We designed a dedicated architecture for a sparse CNN and a memory-buffering schedule for a split-CNN, and implemented them on the PYNQ-Z1 board, which carries a low-end FPGA. A classification experiment using VGG16 shows that our implementation is 3.1 times faster than the GPU and 5.4 times faster than an existing FPGA implementation. |
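
The two ideas named in the abstract, patch-wise processing of the feature map and skipping zero weights, can be illustrated with a small software sketch. This is not the authors' hardware architecture or scheduling scheme. It is a plain-Python toy under stated assumptions: a single-channel feature map, one 3x3 kernel, and overlapping input windows (a halo of `ksize - 1` pixels) so that the patch-wise result matches the full convolution exactly. The paper's split-CNN may treat patch boundaries differently. All function names (`sparse_kernel`, `conv2d_sparse`, `conv2d_split`) are hypothetical.

```python
def sparse_kernel(kernel):
    """Keep only the nonzero weights as (dy, dx, weight) triples."""
    return [(dy, dx, w)
            for dy, row in enumerate(kernel)
            for dx, w in enumerate(row)
            if w != 0]


def conv2d_sparse(fmap, taps, ksize):
    """Valid 2D cross-correlation that visits only the nonzero taps."""
    out_h = len(fmap) - ksize + 1
    out_w = len(fmap[0]) - ksize + 1
    out = [[0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            out[y][x] = sum(w * fmap[y + dy][x + dx] for dy, dx, w in taps)
    return out


def conv2d_split(fmap, taps, ksize, patch):
    """Compute the output tile by tile; each tile reads a small input window.

    Returns the assembled output and the largest window size (in pixels)
    that had to be held at once, a stand-in for on-chip buffer demand.
    """
    out_h = len(fmap) - ksize + 1
    out_w = len(fmap[0]) - ksize + 1
    out = [[0] * out_w for _ in range(out_h)]
    max_window = 0
    for y0 in range(0, out_h, patch):
        for x0 in range(0, out_w, patch):
            y1 = min(y0 + patch, out_h)
            x1 = min(x0 + patch, out_w)
            # Input window for this tile, including the (ksize - 1) halo.
            window = [row[x0:x1 + ksize - 1]
                      for row in fmap[y0:y1 + ksize - 1]]
            max_window = max(max_window, len(window) * len(window[0]))
            tile = conv2d_sparse(window, taps, ksize)
            for ty, tile_row in enumerate(tile):
                out[y0 + ty][x0:x1] = tile_row
    return out, max_window


# Toy data (assumed for illustration only): 8x8 map, 3x3 kernel with 3 nonzeros.
fmap = [[(3 * y + 5 * x) % 7 for x in range(8)] for y in range(8)]
kernel = [[0, 2, 0],
          [0, 0, -1],
          [1, 0, 0]]
taps = sparse_kernel(kernel)

full = conv2d_sparse(fmap, taps, 3)
split, max_window = conv2d_split(fmap, taps, 3, patch=3)

print("nonzero taps:", len(taps), "of", len(kernel) * len(kernel[0]))
print("largest window held at once:", max_window, "of", len(fmap) * len(fmap[0]))
print("patch-wise result matches full result:", split == full)
```

In this toy, only the nonzero taps contribute multiply-accumulate operations, and each tile needs only its own input window in fast memory rather than the whole feature map. Those are the two effects the abstract attributes to weight sparseness and the split-CNN, respectively.
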
Year | DOI | Venue |
---|---|---|
2021 | 10.1587/transinf.2021PAP0011 | IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS |
Keywords | DocType | Volume |
CNN, sparse CNN, embedded system, FPGA | Journal | E104D |
Issue | ISSN | Citations |
12 | 1745-1361 | 0 |
PageRank | References | Authors |
0.34 | 0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Akira Jinguji | 1 | 0 | 0.34 |
Shimpei Sato | 2 | 0 | 0.34 |
Hiroki Nakahara | 3 | 155 | 37.34 |