Title
Binarized Convolutional Neural Networks for Efficient Inference on GPUs
Abstract
Convolutional neural networks have recently achieved significant breakthroughs in various image classification tasks, but they are computationally expensive, which can make them difficult to deploy on embedded and low-power devices. In this paper, convolutional neural network binarization is implemented on GPU-based platforms for real-time inference on resource-constrained devices. In binarized networks, all weights and intermediate computations between layers are quantized to +1 and -1, allowing multiplications and additions to be replaced with bit-wise operations between 32-bit words. This representation completely eliminates the need for floating-point multiplications and additions and decreases both the computational load and the memory footprint compared to a full-precision floating-point network, making it well suited for resource-constrained environments. We compare the performance of our implementation with an equivalent floating-point implementation on one desktop and two embedded GPU platforms. Our implementation achieves a maximum speedup of 7.4x with only a 4.4% loss in accuracy compared to a reference implementation.
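The ±1 arithmetic the abstract describes is commonly realized with an XNOR/popcount reduction over packed bit words: encoding +1 as bit 1 and -1 as bit 0, the dot product of two length-n vectors equals n minus twice the number of mismatching bits. A minimal sketch (helper names are illustrative, not from the paper, and a GPU kernel would operate on 32-bit words rather than Python integers):

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into an integer bit mask (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word

def binary_dot(a_word, b_word, n):
    """Dot product of two packed {-1,+1} vectors of length n."""
    # XOR marks mismatching positions; each mismatch contributes -1
    # instead of +1, so dot = n - 2 * popcount(a XOR b).
    mismatches = bin(a_word ^ b_word).count("1")
    return n - 2 * mismatches

a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
# Full-precision dot product: 1*1 + (-1)*1 + 1*(-1) + 1*1 = 0
assert binary_dot(pack_bits(a), pack_bits(b), 4) == 0
```

On GPUs the same reduction maps to a bitwise XOR followed by a hardware population-count instruction per 32-bit word, which is what replaces the floating-point multiply-accumulate.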
Year
2018
DOI
10.23919/EUSIPCO.2018.8553594
Venue
European Signal Processing Conference
Keywords
model compression, binarized convolutional neural networks, optimization, image classification
DocType
Conference
Volume
abs/1808.00209
ISSN
2076-1465
Citations
0
PageRank
0.34
References
0
Authors
3

Name             Order  Citations  PageRank
Mir Khan         1      0          0.34
Heikki Huttunen  2      244        28.20
Jani Boutellier  3      137        25.36