Title |
---|
Symmetric k-Means for Deep Neural Network Compression and Hardware Acceleration on FPGAs |
Abstract |
---|
Convolutional Neural Networks (CNNs) are popular models that have been successfully applied to diverse domains such as vision, speech, and text. To reduce inference-time latency, it is common to employ hardware accelerators, which often require a model compression step. Contrary to most compression algorithms, which are agnostic of the underlying hardware acceleration strategy, this paper introduces a novel Symmetric k-means based compression algorithm that is specifically designed to support a new FPGA-based hardware acceleration scheme by reducing the number of inference-time multiply-accumulate (MAC) operations by up to 98%. First, a simple k-means based training approach is presented; then, as an extension, Symmetric k-means is proposed, which yields twice the reduction in MAC operations for the same bit-depth as the simple k-means approach. A comparative analysis is conducted on popular CNN architectures for tasks including classification, object detection, and end-to-end stereo matching on various datasets. For all tasks, compression down to 3 bits is demonstrated, while no loss in accuracy is observed for 5-bit quantization. |
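The simple k-means compression step the abstract refers to can be sketched as 1-D clustering of a layer's weights, so that every weight is replaced by one of 2^b shared centroid values. The sketch below (function name and all details are illustrative assumptions, not the paper's code) uses plain Lloyd's algorithm; the paper's Symmetric k-means extension, which constrains centroids to come in ± pairs to halve MAC work, is not reproduced here.

```python
import numpy as np

def kmeans_quantize(weights, n_bits=3, n_iters=25):
    """Quantize a weight tensor to 2**n_bits shared values via 1-D k-means.

    Illustrative sketch only: plain Lloyd's algorithm, without the
    symmetric (+/- paired centroid) constraint described in the paper.
    Returns (centroids, assignments, dequantized weights).
    """
    w = np.asarray(weights, dtype=np.float64).ravel()
    k = 2 ** n_bits
    # Initialize centroids uniformly over the weight range.
    centroids = np.linspace(w.min(), w.max(), k)
    for _ in range(n_iters):
        # Assign each weight to its nearest centroid.
        assign = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        # Recompute each centroid as its cluster mean (skip empty clusters).
        for j in range(k):
            members = w[assign == j]
            if members.size:
                centroids[j] = members.mean()
    assign = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
    # Dequantize: every weight becomes its centroid value.
    return centroids, assign, centroids[assign].reshape(np.shape(weights))
```

At inference time only the small centroid table and the b-bit assignment indices need to be stored, which is what enables the hardware to share multiplications among all weights mapped to the same centroid.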
Year | DOI | Venue
---|---|---
2020 | 10.1109/JSTSP.2020.2968810 | IEEE Journal of Selected Topics in Signal Processing

Keywords | DocType | Volume
---|---|---
Convolutional neural network (CNN), deep learning, k-means, quantization, model compression, FPGA, object detection, stereo | Journal | 14

Issue | ISSN | Citations
---|---|---
4 | 1932-4553 | 1

PageRank | References | Authors
---|---|---
0.40 | 0 | 5

Name | Order | Citations | PageRank
---|---|---|---
Akshay Jain | 1 | 1 | 0.40 |
Pulkit Goel | 2 | 1 | 0.40 |
Shivam Aggarwal | 3 | 1 | 0.40 |
Alexander Fell | 4 | 66 | 8.66 |
Saket Anand | 5 | 87 | 9.36 |