Title
ADMM-based Weight Pruning for Real-Time Deep Learning Acceleration on Mobile Devices
Abstract
Deep learning solutions are increasingly deployed in mobile applications, at least for the inference phase. Due to their large model sizes and computational requirements, model compression of deep neural networks (DNNs) becomes necessary, especially given the real-time requirements of embedded systems. In this paper, we extend prior work on systematic DNN weight pruning using ADMM (Alternating Direction Method of Multipliers). We integrate ADMM regularization with masked mapping/retraining, thereby guaranteeing solution feasibility and providing high solution quality. Besides superior performance on representative DNN benchmarks (e.g., AlexNet, ResNet), we focus on two new applications, facial emotion detection and eye tracking, and develop a top-down framework of DNN training, model compression, and acceleration on mobile devices. Experimental results show that, with negligible accuracy degradation, the proposed method achieves significant storage/memory reduction and speedup on mobile devices.
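The abstract describes the two-phase pipeline (ADMM regularization followed by masked mapping/retraining) only at a high level. Below is a minimal PyTorch sketch of such a pipeline, assuming magnitude-based projection onto a per-layer sparsity constraint with a uniform keep ratio; the function names, hyperparameters (keep_ratio, rho, step counts), and training loop are illustrative assumptions, not the authors' exact implementation.

```python
# Hedged sketch of ADMM-based weight pruning with masked retraining (assumed setup).
import torch
import torch.nn as nn


def project_to_sparsity(weight, keep_ratio):
    """Euclidean projection onto the set of tensors with at most
    keep_ratio * numel nonzeros: keep the largest-magnitude entries."""
    flat = weight.flatten()
    k = max(1, int(keep_ratio * flat.numel()))
    threshold = flat.abs().topk(k).values.min()
    return torch.where(weight.abs() >= threshold, weight, torch.zeros_like(weight))


def admm_prune(model, loss_fn, data_loader, keep_ratio=0.1, rho=1e-3,
               admm_steps=3, epochs_per_step=1, lr=1e-3):
    """ADMM regularization phase, then masked mapping/retraining."""
    layers = [m for m in model.modules() if isinstance(m, (nn.Linear, nn.Conv2d))]
    Z = [project_to_sparsity(m.weight.detach().clone(), keep_ratio) for m in layers]
    U = [torch.zeros_like(m.weight) for m in layers]
    opt = torch.optim.SGD(model.parameters(), lr=lr)

    for _ in range(admm_steps):
        for _ in range(epochs_per_step):
            for x, y in data_loader:
                opt.zero_grad()
                loss = loss_fn(model(x), y)
                # Augmented Lagrangian term pulls each W toward its sparse copy Z.
                for m, z, u in zip(layers, Z, U):
                    loss = loss + (rho / 2) * torch.norm(m.weight - z + u) ** 2
                loss.backward()
                opt.step()
        # Z-update: project W + U onto the sparsity constraint; U-update: dual ascent.
        with torch.no_grad():
            for i, m in enumerate(layers):
                Z[i] = project_to_sparsity(m.weight + U[i], keep_ratio)
                U[i] = U[i] + m.weight - Z[i]

    # Masked mapping/retraining: hard-prune to the final masks, then fine-tune
    # only the surviving weights so the sparse solution stays feasible.
    masks = []
    with torch.no_grad():
        for m in layers:
            mask = (project_to_sparsity(m.weight, keep_ratio) != 0).float()
            m.weight.mul_(mask)
            masks.append(mask)
    for _ in range(epochs_per_step):
        for x, y in data_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            with torch.no_grad():
                for m, mask in zip(layers, masks):
                    m.weight.mul_(mask)  # keep pruned weights at zero
    return model
```

The design point illustrated here is that the ADMM phase keeps the weights dense while an auxiliary sparse copy Z carries the constraint, and the final masked-retraining phase hard-prunes the weights and fine-tunes only the survivors, which is what makes the delivered model actually satisfy the pruning constraint (solution feasibility).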
Year
2019
DOI
10.1145/3299874.3319492
Venue
Proceedings of the 2019 Great Lakes Symposium on VLSI
Keywords
acceleration, mobile devices, neural networks, real-time
Field
Computer science, Inference, Real-time computing, Mobile device, Regularization (mathematics), Eye tracking, Acceleration, Artificial intelligence, Deep learning, Artificial neural network, Speedup
DocType
Conference
ISSN
1066-1395
ISBN
978-1-4503-6252-8
Citations
2
PageRank
0.43
References
0
Authors
9
Name | Order | Citations | PageRank
Hongjia Li | 1 | 7 | 5.91
Ning Liu | 2 | 15 | 3.59
Xiaolong Ma | 3 | 22 | 5.90
Sheng Lin | 4 | 139 | 14.39
Shaokai Ye | 5 | 38 | 6.53
Tianyun Zhang | 6 | 31 | 6.42
Xue Lin | 7 | 86 | 14.97
Wenyao Xu | 8 | 615 | 77.06
Yanzhi Wang | 9 | 1082 | 136.11