Title
PQ-CNN: Accelerating Product Quantized Convolutional Neural Network on FPGA
Abstract
This work presents an efficient CNN computation framework on FPGA that uses Product Quantization (PQ). Compared with other compression methods, PQ achieves higher compression ratios and also alleviates the irregularity problem. However, its algorithmic benefits do not translate into system performance gains, for two reasons: 1) a large codebook diminishes the compression ratio; 2) the large number of look-up operations is inefficient on CPU and GPU architectures. To address these problems, we first provide an analytical model to guide our design and identify a dilemma in selecting PQ parameters. We then propose a software/hardware method to tackle these issues, and present a complete framework to optimally implement PQ-CNN on FPGA. In our experiments, we achieve 140 Tops equivalent throughput and 475 Gops/W energy efficiency, with less than 0.5% accuracy degradation.
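To make the two costs named in the abstract concrete, here is a minimal, generic sketch of product quantization in plain Python. It is not the paper's design: the function names, the toy codebooks, and the storage model in `compression_ratio` (32-bit values, one index of ceil(log2 k) bits per sub-vector) are all assumptions chosen for illustration. It shows how a dot product becomes one table look-up per subspace, and how the codebook's own storage eats into the compression ratio.

```python
import math

def split(vec, m):
    """Split vec into m equal-length sub-vectors."""
    d = len(vec) // m
    return [vec[i * d:(i + 1) * d] for i in range(m)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def encode(weight, codebooks):
    """Per subspace, return the index of the nearest codeword (squared L2)."""
    codes = []
    for sub, book in zip(split(weight, len(codebooks)), codebooks):
        dists = [sum((s - c) ** 2 for s, c in zip(sub, word)) for word in book]
        codes.append(dists.index(min(dists)))
    return codes

def build_tables(x, codebooks):
    """Precompute x_sub . codeword for every codeword in every subspace."""
    return [[dot(sub, word) for word in book]
            for sub, book in zip(split(x, len(codebooks)), codebooks)]

def pq_dot(codes, tables):
    """Approximate dot product: one table look-up per subspace."""
    return sum(table[code] for code, table in zip(codes, tables))

def compression_ratio(n_vectors, dim, m, k, bits=32):
    """Original bits / (index bits + codebook bits); an assumed storage model."""
    original = n_vectors * dim * bits
    indices = n_vectors * m * math.ceil(math.log2(k))
    codebook = k * dim * bits  # m codebooks, each with k codewords of dim/m values
    return original / (indices + codebook)

# Toy setup (hypothetical values): 2 subspaces, 2 codewords each, sub-dimension 2.
codebooks = [[[0.0, 0.0], [1.0, 1.0]],
             [[1.0, 0.0], [0.0, 1.0]]]
weight = [0.9, 1.1, 0.1, 0.8]
x = [1.0, 2.0, 3.0, 4.0]

codes = encode(weight, codebooks)
approx = pq_dot(codes, build_tables(x, codebooks))

small_codebook = compression_ratio(1024, 64, 8, 16)
large_codebook = compression_ratio(1024, 64, 8, 256)
```

Under this assumed storage model, growing the codebook from 16 to 256 codewords per subspace lowers the ratio for the same 1024 vectors, because the codebook becomes a large share of what is stored. A larger codebook normally buys a more accurate approximation, which suggests the kind of parameter trade-off the abstract calls a dilemma.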
Year
2018
DOI
10.1109/FCCM.2018.00041
Venue
2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)
Keywords
CNN, FPGA, Product quantization
Field
Convolutional neural network, Computer science, Efficient energy use, Parallel computing, Field-programmable gate array, Software, Compression ratio, Throughput, Computation, Codebook
DocType
Conference
ISBN
978-1-5386-5523-8
Citations
1
PageRank
0.35
References
1
Authors
2
Name            Order  Citations  PageRank
Jialiang Zhang  1      94         11.46
Jing Li         2      208        30.49