Abstract |
---|
This work presents an efficient FPGA framework for CNN computation based on Product Quantization (PQ). Compared to other compression methods, PQ achieves larger compression ratios and also alleviates the irregularity problem. However, its algorithmic benefits do not translate into system performance gains, for two reasons: 1) a large codebook diminishes the compression ratio; 2) the large number of look-up operations is inefficient on CPU and GPU architectures. To address these problems, we first provide an analytical model to guide our design and identify a dilemma in selecting PQ parameters. We then propose a software/hardware co-design method to tackle these issues, and present a complete framework to optimally implement PQ-CNN on FPGA. According to our experimental results, we achieve 140 Tops equivalent throughput and 475 Gops/W energy efficiency with less than 0.5% accuracy degradation. |
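
To make the two bottlenecks named in the abstract concrete, the following is a minimal sketch of a generic PQ dot product, not the authors' FPGA design. It assumes random codebooks instead of trained (k-means) ones, and all function and parameter names (`split`, `encode`, `build_tables`, `pq_dot`, `compression_ratio`, `m`, `k`) are hypothetical. Each weight vector is stored as `m` small codeword indices, the dot product reduces to `m` table look-ups, and the codebook storage is charged against the compression ratio.

```python
import math
import random


def split(vec, m):
    """Split vec into m equal-length subvectors (len(vec) must be divisible by m)."""
    d = len(vec) // m
    return [vec[i * d:(i + 1) * d] for i in range(m)]


def encode(weight, codebooks):
    """For each subspace, return the index of the nearest codeword (squared L2)."""
    codes = []
    for sub, book in zip(split(weight, len(codebooks)), codebooks):
        dists = [sum((a - b) ** 2 for a, b in zip(sub, cw)) for cw in book]
        codes.append(dists.index(min(dists)))
    return codes


def decode(codes, codebooks):
    """Reconstruct the quantized weight vector from its codeword indices."""
    out = []
    for c, book in zip(codes, codebooks):
        out.extend(book[c])
    return out


def build_tables(x, codebooks):
    """tables[s][j] = <x_s, codebooks[s][j]>; built once per input vector."""
    return [
        [sum(a * b for a, b in zip(sub, cw)) for cw in book]
        for sub, book in zip(split(x, len(codebooks)), codebooks)
    ]


def pq_dot(codes, tables):
    """Approximate <x, w> with one table look-up per subspace."""
    return sum(tables[s][c] for s, c in enumerate(codes))


def compression_ratio(n_vectors, dim, m, k, bits=32):
    """Original size over compressed size, counting the codebook itself."""
    original = n_vectors * dim * bits
    codes = n_vectors * m * math.ceil(math.log2(k))
    codebook = k * dim * bits  # m subspaces x k codewords x (dim / m) values
    return original / (codes + codebook)


if __name__ == "__main__":
    random.seed(0)
    dim, m, k = 8, 4, 4
    # Assumption: random codebooks stand in for trained ones.
    codebooks = [
        [[random.uniform(-1, 1) for _ in range(dim // m)] for _ in range(k)]
        for _ in range(m)
    ]
    w = [random.uniform(-1, 1) for _ in range(dim)]
    x = [random.uniform(-1, 1) for _ in range(dim)]
    codes = encode(w, codebooks)
    print(pq_dot(codes, build_tables(x, codebooks)))
    print(compression_ratio(n_vectors=1000, dim=dim, m=m, k=k))
```

In this sketch, raising `k` lowers quantization error but enlarges both the codebook term in `compression_ratio` and the look-up tables. This is one plausible reading of the parameter-selection dilemma the abstract mentions, not a statement of the paper's analytical model.
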
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/FCCM.2018.00041 | 2018 IEEE 26th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) |
Keywords | Field | DocType |
CNN,FPGA,Product quantization | Convolutional neural network,Computer science,Efficient energy use,Parallel computing,Field-programmable gate array,Software,Compression ratio,Throughput,Computation,Codebook | Conference |
ISBN | Citations | PageRank |
978-1-5386-5523-8 | 1 | 0.35 |
References | Authors | |
1 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jialiang Zhang | 1 | 94 | 11.46 |
Jing Li | 2 | 208 | 30.49 |