Abstract
---
Convolutional neural networks have recently achieved significant breakthroughs in various image classification tasks. However, they are computationally expensive, which makes their deployment on embedded and low-power devices difficult. In this paper, convolutional neural network binarization is implemented on GPU-based platforms for real-time inference on resource-constrained devices. In binarized networks, all weights and intermediate computations between layers are quantized to +1 and -1, allowing multiplications and additions to be replaced with bit-wise operations between 32-bit words. This representation completely eliminates the need for floating-point multiplications and additions and decreases both the computational load and the memory footprint compared to a full-precision network implemented in floating point, making it well-suited for resource-constrained environments. We compare the performance of our implementation with an equivalent floating-point implementation on one desktop and two embedded GPU platforms. Our implementation achieves a maximum speedup of 7.4x with only a 4.4% loss in accuracy compared to a reference implementation.
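The abstract describes replacing multiply-accumulate operations with bit-wise operations on packed ±1 values. A minimal sketch of that idea (not the paper's GPU code) is the XNOR-plus-popcount dot product: signs are packed into the bits of a word, XNOR marks positions where the signs match, and a popcount recovers the dot product. The helper names below are illustrative, not from the paper.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into one integer, one bit each (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == +1:
            word |= 1 << i
    return word

def binary_dot(a_word, b_word, n):
    """Dot product of two packed +/-1 vectors of length n.

    XNOR sets a bit where the signs agree; with m agreeing positions,
    the dot product is m - (n - m) = 2*m - n.
    """
    xnor = ~(a_word ^ b_word) & ((1 << n) - 1)  # mask result to n bits
    matches = bin(xnor).count("1")              # popcount
    return 2 * matches - n

# Example: (+1,-1,+1,+1) . (+1,+1,-1,+1) = 1 - 1 - 1 + 1 = 0
print(binary_dot(pack_bits([+1, -1, +1, +1]), pack_bits([+1, +1, -1, +1]), 4))  # -> 0
```

On hardware, one 32-bit XNOR plus a popcount instruction replaces 32 floating-point multiply-adds, which is the source of the speedup the abstract reports.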
Year | DOI | Venue
---|---|---
2018 | 10.23919/EUSIPCO.2018.8553594 | European Signal Processing Conference

Keywords | DocType | Volume
---|---|---
model compression, binarized convolutional neural networks, optimization, image classification | Conference | abs/1808.00209

ISSN | Citations | PageRank
---|---|---
2076-1465 | 0 | 0.34

References | Authors
---|---
0 | 3
Name | Order | Citations | PageRank
---|---|---|---
Mir Khan | 1 | 0 | 0.34
Heikki Huttunen | 2 | 244 | 28.20
Jani Boutellier | 3 | 137 | 25.36