Title | ||
---|---|---|
Quantized NNs as the definitive solution for inference on low-power ARM MCUs?: work-in-progress |
Abstract | ||
---|---|---|
High energy efficiency and low memory footprint are the key requirements for the deployment of deep learning based analytics on low-power microcontrollers. Here we present work-in-progress results with Q-bit Quantized Neural Networks (QNNs) deployed on a commercial Cortex-M7 class microcontroller by means of an extension to the ARM CMSIS-NN library. We show that i) for Q = 4 and Q = 2 low memory footprint QNNs can be deployed with an energy overhead of 30% and 36% respectively against the 8-bit CMSIS-NN due to the lack of quantization support in the ISA; ii) for Q = 1 native instructions can be used, yielding an energy and latency reduction of ~3.8× with respect to CMSIS-NN. Our initial results suggest that a small set of QNN-related specialized instructions could improve performance by as much as 7.5× for Q = 4, 13.6× for Q = 2 and 6.5× for binary NNs. |
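
The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' CMSIS-NN extension; the function names `pack`, `unpack` and `binary_dot` are hypothetical, and the bit layout (lowest bits first) is an assumption. It shows why Q-bit weights shrink the memory footprint, why Q = 4 and Q = 2 need extra shift-and-mask work when the instruction set has no sub-byte support, and how a binary (Q = 1) dot product can be reduced to plain bitwise operations. The XNOR-and-count formulation is a common one for binary networks and is assumed here, not taken from the paper.

```python
def pack(values, q):
    """Pack unsigned q-bit integers (q in {1, 2, 4}) into bytes, lowest bits first."""
    per_byte = 8 // q
    mask = (1 << q) - 1
    out = bytearray()
    for i in range(0, len(values), per_byte):
        byte = 0
        for j, v in enumerate(values[i:i + per_byte]):
            byte |= (v & mask) << (j * q)
        out.append(byte)
    return bytes(out)


def unpack(packed, q, count):
    """Recover `count` unsigned q-bit integers from packed bytes.

    This shift-and-mask loop stands in for the extra software work needed
    when the hardware cannot operate on sub-byte operands directly.
    """
    per_byte = 8 // q
    mask = (1 << q) - 1
    values = []
    for byte in packed:
        for j in range(per_byte):
            values.append((byte >> (j * q)) & mask)
    return values[:count]


def binary_dot(a_bits, b_bits, n):
    """Dot product of two {-1, +1} vectors stored as n-bit integers (bit 1 = +1).

    XNOR marks the positions where both vectors agree; counting those bits
    gives the number of +1 products, the rest contribute -1.
    """
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * matches - n


weights = [3, 0, 15, 7, 1, 9, 4, 2]      # eight 4-bit values
packed4 = pack(weights, 4)               # 4 bytes instead of 8
restored = unpack(packed4, 4, len(weights))
```

Eight weights occupy 8 bytes at 8 bits, 4 bytes at Q = 4, 2 bytes at Q = 2 and 1 byte at Q = 1. The saving in storage is paid for in `unpack` unless the arithmetic can run directly on the packed form, as `binary_dot` does.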
Year | Venue | Field |
---|---|---|
2018 | ESWEEK '18: Fourteenth Embedded Systems Week, Turin, Italy, September 2018 | Latency (engineering), Work in process, Computer science, Real-time computing, Microcontroller, Artificial intelligence, Deep learning, Quantization (signal processing), Memory footprint, Analytics, Computer engineering, Binary number |
DocType | ISBN | Citations |
Conference | 978-1-5386-5562-7 | 0 |
PageRank | References | Authors |
0.34 | 4 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Manuele Rusci | 1 | 2 | 1.47 |
Alessandro Capotondi | 2 | 39 | 8.25 |
Francesco Conti 0001 | 3 | 125 | 18.24 |
Luca Benini | 4 | 13116 | 1188.49 |