A Tri-State Weight Convolutional Neural Network for an FPGA: Applied to YOLOv2 Object Detector - Citegraph

Paper Info

Title
A Tri-State Weight Convolutional Neural Network for an FPGA: Applied to YOLOv2 Object Detector

Abstract
Frame-based object detectors such as YOLO (You Only Look Once) are used in embedded vision systems such as robots, automobiles, security cameras, and drones. However, these systems require detection with high performance per power on an inexpensive device. In this paper, we propose a tri-state weight CNN, which generalizes low-precision and sparse (pruned) CNN weights. In the former part, we set the weights to {-1,0,+1} as a ternary CNN, while in the latter part, we set them to {-w,0,+w} as a sparse-weight CNN. The proposed tri-state CNN is a kind of mixed-precision CNN, which is suitable for an object detector consisting of bounding box prediction (regression) and class estimation (classification). We apply an indirect memory access architecture to skip zero weights and propose a weight-parallel 2D convolutional circuit. It can be applied efficiently to an AlexNet-based CNN, which has kernels of different sizes. We design an AlexNet-based YOLOv2 to reduce the number of layers for low-latency computation. In the experiment, the proposed tri-state scheme reduces the memory size for weights by 92%. We implement the proposed tri-state weight YOLOv2 on the Avnet Inc. UltraZed-EG starter kit, which has a Xilinx Inc. Zynq UltraScale+ MPSoC ZU3EG. It achieved 61.70 frames per second (FPS), which exceeds the standard video frame rate (29.97 FPS). Compared with the ARM Cortex-A57, it was 268.2 times faster, and its performance-per-power efficiency was 313.51 times better. Compared with the NVIDIA Pascal embedded GPU, it was 4.0 times faster, and its performance-per-power efficiency was 11.35 times better.
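The tri-state idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `to_tri_state` and `sparse_dot`, the magnitude threshold, and the choice of w as the mean magnitude of the kept weights are all assumptions made for illustration. The sketch maps weights to {-w,0,+w}, stores only the nonzero entries as (index, sign) pairs, and uses those indices to skip zeros during accumulation, loosely mirroring the indirect memory access described above. Setting w = 1 gives the ternary {-1,0,+1} case.

```python
def to_tri_state(weights, threshold):
    """Quantize weights to {-w, 0, +w} and keep only the nonzero entries.

    Assumption: weights with magnitude <= threshold are pruned to zero,
    and w is the mean magnitude of the remaining weights.
    Returns (w, entries), where entries is a list of (index, sign) pairs.
    """
    kept = [abs(x) for x in weights if abs(x) > threshold]
    w = sum(kept) / len(kept) if kept else 0.0
    entries = [
        (i, 1 if x > 0 else -1)
        for i, x in enumerate(weights)
        if abs(x) > threshold
    ]
    return w, entries


def sparse_dot(w, entries, activations):
    """Dot product that visits only nonzero weights via their stored indices.

    Each term needs an add or a subtract; the single multiply by w
    happens once, after accumulation.
    """
    acc = 0.0
    for index, sign in entries:
        if sign > 0:
            acc += activations[index]
        else:
            acc -= activations[index]
    return w * acc


if __name__ == "__main__":
    weights = [0.9, -0.05, -1.1, 0.02, 1.0]
    activations = [1.0, 2.0, 3.0, 4.0, 5.0]
    w, entries = to_tri_state(weights, threshold=0.5)
    print(w, entries, sparse_dot(w, entries, activations))
```

In this toy example, two of the five weights are pruned, so only three index/sign pairs are stored and visited. The 92% memory reduction reported in the abstract depends on the trained network and its storage format, which this sketch does not model.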

Year	DOI	Venue
2018	10.1109/FPT.2018.00058	2018 International Conference on Field-Programmable Technology (FPT)
Keywords	Field	DocType
FPGA,Object Detection,Deep Learning,Embedded System	Object detection,Computer science,Convolutional neural network,Parallel computing,Field-programmable gate array,Computational science,Artificial intelligence,Frame rate,Deep learning,Detector,MPSoC,Minimum bounding box	Conference
ISBN	Citations	PageRank
978-1-7281-0215-3	0	0.34
References	Authors
0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Hiroki Nakahara	1	155	37.34
Masayuki Shimoda	2	8	6.45
Shimpei Sato	3	43	13.03
