Title
Quantized NNs as the definitive solution for inference on low-power ARM MCUs?: work-in-progress
Abstract
High energy efficiency and low memory footprint are the key requirements for the deployment of deep learning based analytics on low-power microcontrollers. Here we present work-in-progress results with Q-bit Quantized Neural Networks (QNNs) deployed on a commercial Cortex-M7 class microcontroller by means of an extension to the ARM CMSIS-NN library. We show that i) for Q = 4 and Q = 2, low memory footprint QNNs can be deployed with an energy overhead of 30% and 36% respectively against the 8-bit CMSIS-NN, due to the lack of quantization support in the ISA; ii) for Q = 1, native instructions can be used, yielding an energy and latency reduction of ~3.8× with respect to CMSIS-NN. Our initial results suggest that a small set of QNN-related specialized instructions could improve performance by as much as 7.5× for Q = 4, 13.6× for Q = 2 and 6.5× for binary NNs.
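The contrast the abstract draws between sub-byte and binary networks can be illustrated with a minimal C sketch. It is not the authors' CMSIS-NN extension. The function names (dot_q4, dot_bin, popcount32), the nibble packing order (low nibble first), and the bit encoding (1 for +1, 0 for -1) are all assumptions made for illustration. The 4-bit path has to extract and sign-extend every weight in software before multiplying, which is the kind of extra work an ISA without sub-byte support imposes. The binary path needs only bitwise XOR and a bit count.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dot product of n signed 4-bit weights with n int8
 * activations. Assumed layout: two weights per byte, low nibble first. */
int32_t dot_q4(const uint8_t *packed_w, const int8_t *act, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t byte = packed_w[i / 2];
        uint8_t nib = (i & 1) ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0F);
        /* Sign-extend a 4-bit two's-complement value to the range [-8, 7]. */
        int32_t w = (int32_t)(nib ^ 0x8) - 8;
        acc += w * (int32_t)act[i];
    }
    return acc;
}

/* Portable bit count (clears the lowest set bit on each iteration). */
static int popcount32(uint32_t x)
{
    int count = 0;
    while (x != 0) {
        x &= x - 1;
        count++;
    }
    return count;
}

/* Hypothetical helper: dot product of two {-1, +1} vectors packed 32 per
 * word. Assumed encoding: bit 1 means +1, bit 0 means -1. Each mismatching
 * bit contributes -1 and each matching bit +1, so the result is
 * total_bits - 2 * mismatches. */
int32_t dot_bin(const uint32_t *a, const uint32_t *b, size_t n_words)
{
    int32_t mismatches = 0;
    for (size_t i = 0; i < n_words; i++) {
        mismatches += popcount32(a[i] ^ b[i]);
    }
    return (int32_t)(32 * n_words) - 2 * mismatches;
}
```

As a usage example under these assumptions, the weights {1, -2, 7, -8} pack into the bytes {0xE1, 0x87}, and with activations {3, 4, -1, 2} dot_q4 returns -28. dot_bin on two identical all-ones words returns 32.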
Year
2018
Venue
ESWEEK '18: Fourteenth Embedded Systems Week, Turin, Italy, September 2018
Field
Latency (engineering), Work in process, Computer science, Real-time computing, Microcontroller, Artificial intelligence, Deep learning, Quantization (signal processing), Memory footprint, Analytics, Computer engineering, Binary number
DocType
Conference
ISBN
978-1-5386-5562-7
Citations
0
PageRank
0.34
References
4
Authors
4
Name | Order | Citations | PageRank
Manuele Rusci | 1 | 2 | 1.47
Alessandro Capotondi | 2 | 39 | 8.25
Francesco Conti 0001 | 3 | 125 | 18.24
Luca Benini | 4 | 13116 | 1188.49