Title
Symmetric k-Means for Deep Neural Network Compression and Hardware Acceleration on FPGAs
Abstract
Convolutional Neural Networks (CNNs) are popular models that have been successfully applied to diverse domains such as vision, speech, and text. To reduce inference-time latency, it is common to employ hardware accelerators, which often require a model compression step. Unlike most compression algorithms, which are agnostic to the underlying hardware acceleration strategy, this paper introduces a novel Symmetric k-means based compression algorithm that is specifically designed to support a new FPGA-based hardware acceleration scheme by reducing the number of inference-time multiply-accumulate (MAC) operations by up to 98%. First, a simple k-means based training approach is presented; then, as an extension, Symmetric k-means is proposed, which yields twice the reduction in MAC operations for the same bit-depth as the simple k-means approach. A comparative analysis is conducted on popular CNN architectures for tasks including classification, object detection, and end-to-end stereo matching on various datasets. For all tasks, model compression down to 3 bits is demonstrated, while no loss in accuracy is observed at 5-bit quantization.
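The core idea summarized above can be illustrated with a minimal sketch: weights are clustered into a small codebook, and in the symmetric variant the codebook consists of signed pairs ±c, so only half as many distinct magnitudes (and hence multiplications) are needed for a given bit-depth. This is an inferred illustration based only on the abstract, not the authors' exact algorithm; the function name and initialization are hypothetical.

```python
# Hypothetical sketch of symmetric k-means weight quantization, inferred
# from the abstract (NOT the paper's exact method): cluster the weight
# magnitudes into k/2 centroids, so the effective codebook is the signed
# set {±c_1, ..., ±c_{k/2}}.
import numpy as np

def symmetric_kmeans_quantize(weights, k, iters=50):
    """Quantize a 1-D weight array to a symmetric k-entry codebook."""
    mags = np.abs(weights).ravel()
    # Initialize k/2 magnitude centroids evenly over the observed range.
    centroids = np.linspace(mags.min(), mags.max(), k // 2)
    for _ in range(iters):
        # Assign each magnitude to its nearest centroid (1-D k-means step).
        assign = np.abs(mags[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(len(centroids)):
            members = mags[assign == j]
            if members.size:  # skip empty clusters
                centroids[j] = members.mean()
    # Each weight becomes sign(w) times its nearest magnitude centroid.
    idx = np.abs(np.abs(weights)[..., None] - centroids).argmin(axis=-1)
    return np.sign(weights) * centroids[idx], centroids
```

Because the codebook is symmetric, a hardware multiplier only needs the k/2 positive magnitudes; the sign bit is applied separately, which is consistent with the abstract's claim of a 2x MAC reduction at the same bit-depth.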
Year
2020
DOI
10.1109/JSTSP.2020.2968810
Venue
IEEE Journal of Selected Topics in Signal Processing
Keywords
Convolutional neural network (CNN), deep learning, k-means, quantization, model compression, FPGA, object detection, stereo
DocType
Journal
Volume
14
Issue
1
ISSN
1932-4553
Citations
4
PageRank
0.40
References
0
Authors
5
Name             Order  Citations  PageRank
Akshay Jain      1      1          0.40
Pulkit Goel      2      1          0.40
Shivam Aggarwal  3      1          0.40
Alexander Fell   4      66         8.66
Saket Anand      5      87         9.36