Title
On-Demand Deep Model Compression for Mobile Devices: A Usage-Driven Model Selection Framework.
Abstract
Recent research has demonstrated the potential of deploying deep neural networks (DNNs) on resource-constrained mobile platforms by trimming down network complexity with various compression techniques. However, current practice investigates only stand-alone compression schemes, even though each compression technique may be well suited only to certain types of DNN layers. Moreover, these techniques are optimized solely for DNN inference accuracy, without explicitly considering other application-driven system performance metrics (e.g., latency and energy cost) or the varying resource availability across platforms (e.g., storage and processing capability). In this paper, we explore the desirable tradeoff between performance and resource constraints, guided by user-specified needs, from a holistic system-level viewpoint. Specifically, we develop a usage-driven selection framework, referred to as AdaDeep, that automatically selects a combination of compression techniques for a given DNN to achieve an optimal balance between user-specified performance goals and resource constraints. In an extensive evaluation on five public datasets and across twelve mobile devices, AdaDeep achieves up to 9.8x latency reduction, 4.3x energy efficiency improvement, and 38x storage reduction in DNNs while incurring negligible accuracy loss. AdaDeep also uncovers multiple effective combinations of compression techniques unexplored in the existing literature.
Year
2018
DOI
10.1145/3210240.3210337
Venue
MobiSys '18: The 16th Annual International Conference on Mobile Systems, Applications, and Services, Munich, Germany, June 2018
Keywords
deep learning, model compression, deep reinforcement learning
Field
Network complexity, Efficient energy use, Computer science, Inference, Latency (engineering), Model selection, Real-time computing, Mobile device, Artificial intelligence, Deep learning, Trimming, Distributed computing
DocType
Conference
ISBN
978-1-4503-5720-3
Citations
27
PageRank
0.80
References
25
Authors
6
Name        | Order | Citations | PageRank
Sicong Liu  | 1     | 217       | 3.61
Ying-yan Lin| 2     | 1062      | 1.39
Zimu Zhou   | 3     | 11576     | 1.40
Kaiming Nan | 4     | 27        | 0.80
Hui Liu     | 5     | 30        | 2.22
Junzhao Du  | 6     | 1311      | 5.61