Title
Resource-Aware Optimization of DNNs for Embedded Applications
Abstract
Despite their outstanding success in solving complex computer vision problems, Deep Neural Networks (DNNs) still require high-performance hardware for real-time inference. Therefore, they are not suitable for low-cost embedded hardware, where memory resources, computational performance, and power consumption are restricted. Furthermore, current approaches to fitting neural networks to embedded hardware are time-consuming, which leads to slow development cycles. To address these drawbacks and satisfy the demands of embedded hardware, this paper proposes a computationally efficient magnitude-based pruning scheme, based on a half-interval search, combined with effective weight sharing, fixed-point quantization, and lossless compression. The proposed solution can be used to generate a model optimized with respect to either memory demand or execution time. For instance, the memory demand of LeNet is reduced by about 385×. VGG16 is pruned by about 14.5×, while its computational cost is reduced by about 1.6× for a CPU-based application and 4.8× for an FPGA-based one.
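The abstract names magnitude-based pruning driven by a half-interval search but gives no procedure. The following is a minimal sketch of how the two ideas can be combined, not the authors' implementation. The function names, the `evaluate` callback, and the `min_score` tolerance are hypothetical, and the sketch assumes that the score never increases as the pruning threshold grows.

```python
from typing import Callable, List, Sequence, Tuple


def prune_by_magnitude(weights: Sequence[float], threshold: float) -> List[float]:
    """Zero every weight whose absolute value is below the threshold."""
    return [0.0 if abs(w) < threshold else w for w in weights]


def half_interval_prune(
    weights: Sequence[float],
    evaluate: Callable[[List[float]], float],  # hypothetical quality metric, e.g. accuracy
    min_score: float,                          # hypothetical lowest acceptable score
    steps: int = 20,
) -> Tuple[List[float], float]:
    """Bisect on the threshold to prune as much as the score constraint allows.

    Assumes evaluate() is non-increasing in the threshold; without that
    property the search may settle on a suboptimal threshold.
    """
    lo, hi = 0.0, max(abs(w) for w in weights)
    best = list(weights)  # unpruned fallback if no threshold is acceptable
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        candidate = prune_by_magnitude(weights, mid)
        if evaluate(candidate) >= min_score:
            best, lo = candidate, mid  # still acceptable: try pruning harder
        else:
            hi = mid                   # too aggressive: back off
    return best, lo


# Toy usage: the "score" is the fraction of total weight magnitude retained.
toy_weights = [0.05, -0.1, 0.2, -0.4, 0.8, -1.6]
total = sum(abs(w) for w in toy_weights)
pruned, threshold = half_interval_prune(
    toy_weights,
    evaluate=lambda ws: sum(abs(w) for w in ws) / total,
    min_score=0.9,
)
```

Each iteration halves the search interval, so `steps` evaluations narrow the threshold to within `max|w| / 2**steps`. In a real network, `evaluate` would be a validation pass, which is why keeping the number of evaluations small matters.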
Year
2019
DOI
10.1109/CRV.2019.00011
Venue
2019 16th Conference on Computer and Robot Vision (CRV)
Keywords
Pruning, Quantization, Weight-Sharing, Coding, Embedded HW, CNN
Field
Computer vision, Inference, Computer science, Field-programmable gate array, Embedded applications, Artificial intelligence, Execution time, Artificial neural network, Quantization (signal processing), Computer engineering, Lossless compression, Power consumption
DocType
Conference
ISBN
978-1-7281-1839-0
Citations
0
PageRank
0.34
References
5
Authors
3
Name                    Order  Citations  PageRank
Alexander Frickenstein  1      4          3.53
Christian Unger         2      7          2.59
Walter Stechele         3      365        52.77