Title
AutoML for Architecting Efficient and Specialized Neural Networks
Abstract
Efficient deep learning inference requires algorithm and hardware codesign to enable specialization: we usually need to change the algorithm to reduce memory footprint and improve energy efficiency. However, the extra degree of freedom introduced by neural architecture design makes the design space much larger: it is not only about designing the hardware architecture but also about codesigning the neural architecture to fit the hardware. It is difficult for human engineers to exhaust such a design space by heuristics. We propose design automation techniques for architecting efficient neural networks given a target hardware platform. We investigate automatically designing specialized and fast models, automated channel pruning, and automated mixed-precision quantization. We demonstrate that such learning-based, automated design achieves better performance and efficiency than rule-based human design. Moreover, we shorten the design cycle by 200× compared with previous work, so that we can afford to design specialized neural network models for different hardware platforms.
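The abstract lists automated channel pruning among the techniques, but the paper text and code are not part of this record. As a generic illustration only (not the authors' learned pruning policy), the following is a minimal sketch, assuming PyTorch, of magnitude-based channel pruning: ranking a convolution layer's output channels by L1 weight norm and keeping a hand-set fraction. The function name `prune_conv_channels` and the `keep_ratio` parameter are hypothetical; in the paper's setting the per-layer ratio would be chosen by an automated search rather than by hand.

```python
# Illustrative sketch only: keep the output channels of a Conv2d layer
# with the largest L1 weight norm. This is the kind of per-layer decision
# that an automated (AutoML) agent makes instead of a human heuristic.
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Return a new Conv2d keeping the `keep_ratio` fraction of output
    channels with the largest L1 weight norm (hand-set here)."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    # L1 norm of each output channel's weights: shape [out_channels]
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep_idx = torch.argsort(scores, descending=True)[:n_keep]
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep_idx])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep_idx])
    return pruned

# Example: shrink a 64-channel layer to 32 channels.
layer = nn.Conv2d(3, 64, kernel_size=3, padding=1)
print(prune_conv_channels(layer, keep_ratio=0.5))
```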
Year
2020
DOI
10.1109/MM.2019.2953153
Venue
IEEE Micro
Keywords
AutoML, Neural Architecture Search, Channel Pruning, Mixed-Precision, Quantization, Specialization, Efficient Inference
Field
Computer science, Parallel computing, Artificial neural network, Distributed computing
DocType
Journal
Volume
40
Issue
1
ISSN
0272-1732
Citations
2
PageRank
0.37
References
0
Authors
8
Name            Order   Citations   PageRank
Han Cai         1       223         10.39
Ji Lin          2       79          8.18
Yujun Lin       3       101         7.03
Zhijian Liu     4       59          9.80
Kuan Wang       5       45          3.06
Tianzhe Wang    6       10          1.79
Ligeng Zhu      7       83          5.19
Song Han        8       2102        79.81