Title
Inference Acceleration with Adaptive Distributed DNN Partition over Dynamic Video Stream
Abstract
Deep neural network (DNN)-based computer vision applications have proliferated and are widely used in intelligent services on IoT devices. Because DNNs are computationally intensive, deploying and executing intelligent applications in smart scenarios is constrained by limited device resources. Existing job-scheduling strategies focus on a single objective and offer limited support for large-scale end-device scenarios. In this paper, we present ADDP, an adaptive distributed DNN partition method that supports video analysis on large-scale smart cameras. ADDP applies to DNN models commonly used in computer vision and comprises a feature-map layer partition (FLP) module, which supports edge-to-end collaborative model partition, and a feature-map size partition (FSP) module, which supports multidevice parallel inference. Driven by the objective of minimizing inference delay, FLP and FSP trade off the computational and communication resources of different devices. We validate ADDP on heterogeneous devices and show that both the FLP and FSP modules outperform existing approaches, reducing single-frame response latency by 10-25% compared with pure on-device processing.
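The abstract's two ideas can be illustrated with a small sketch. This is not the paper's actual algorithm, only a minimal, made-up example of the two partition decisions it describes: picking a layer split point that minimizes device compute + feature-map transfer + edge compute (FLP-style), and dividing a feature map's rows across devices in proportion to their speed (FSP-style). All function names, numbers, and units here are illustrative assumptions.

```python
# Illustrative sketch only (not ADDP's actual method): latency-minimizing
# layer split between a device and an edge server, plus a proportional
# feature-map row split across parallel devices. All inputs are made up.

def best_split(device_ms, edge_ms, feat_bytes, bw_bytes_per_ms):
    """Choose a split s: layers [0, s) run on-device, layers [s, n) on the edge.

    device_ms[i], edge_ms[i] -- per-layer latency (ms) on each side.
    feat_bytes[i]            -- bytes of the tensor entering layer i
                                (feat_bytes[0] is the raw input).
    bw_bytes_per_ms          -- network bandwidth.
    Returns (split_index, total_latency_ms).
    """
    n = len(device_ms)
    best_s, best_t = 0, float("inf")
    for s in range(n + 1):
        t = sum(device_ms[:s]) + sum(edge_ms[s:])
        if s < n:  # an intermediate feature map must cross the network
            t += feat_bytes[s] / bw_bytes_per_ms
        if t < best_t:
            best_s, best_t = s, t
    return best_s, best_t


def split_rows(total_rows, speeds):
    """Divide feature-map rows among devices proportionally to their speed;
    leftover rows from integer rounding go to the fastest device."""
    shares = [int(total_rows * v / sum(speeds)) for v in speeds]
    shares[speeds.index(max(speeds))] += total_rows - sum(shares)
    return shares
```

With device-side layers that slow down late in the network and feature maps that shrink (e.g. `best_split([1, 1, 10, 10], [0.5, 0.5, 1, 1], [1000, 800, 10, 10], 100)`), the search places the split after the cheap early layers, which is the tradeoff between compute and communication the abstract refers to.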
Year
2022
DOI
10.3390/a15070244
Venue
ALGORITHMS
Keywords
edge computing, deep learning, distributed AI computing, large-scale video analytics
DocType
Journal
Volume
15
Issue
7
ISSN
1999-4893
Citations
0
PageRank
0.34
References
0
Authors
4
Name        Order  Citations  PageRank
Jin Cao     1      0          0.34
Bo Li       2      8          8.65
Mengni Fan  3      0          0.34
Huiyu Liu   4      4          1.49