Title
Learning Driven Parallelization for Large-Scale Video Workload in Hybrid CPU-GPU Cluster
Abstract
Hybrid CPU-GPU clusters have become a promising computing paradigm for large-scale video analytics. However, the uncertainty and variability of workloads and the heterogeneity of resources in the cluster can lead to unbalanced use of the hybrid computing resources and, in turn, degrade the performance of the computing platform. The problem becomes more challenging given the computational complexity and dependencies of video tasks in the hybrid cluster. In this paper, we focus on the video workload parallelization problem with fine-grained task division and feature description in the hybrid CPU-GPU cluster. First, to achieve high resource utilization and task throughput, we propose a two-stage video task scheduling approach based on deep reinforcement learning. In our approach, the cluster-level scheduler selects a task execution node for each of the mutually independent video tasks, and the node-level scheduler then assigns the interrelated video subtasks to the appropriate computing units. Using a deep Q-network, the two-stage scheduling model is learned online to perform the currently optimal scheduling actions according to the runtime status of the cluster environment, the characteristics of the video tasks, and the dependencies between video tasks. Second, based on transfer learning, we propose a scheduling strategy generalization method that efficiently rebuilds the task scheduling model by referring to an existing model. Finally, we conduct extensive experiments to analyze the impact of the model parameters on the scheduling actions; the experimental results also validate that our learning-based task scheduling approach outperforms other widely used methods.
Year: 2018
DOI: 10.1145/3225058.3225070
Venue: Proceedings of the 47th International Conference on Parallel Processing
Keywords: Heterogeneous computing, video processing, task scheduling, deep reinforcement learning, transfer learning
Field: Video processing, GPU cluster, Workload, Scheduling (computing), Computer science, Parallel computing, Transfer of learning, Symmetric multiprocessor system, Analytics, Distributed computing, Reinforcement learning
DocType: Conference
ISSN: 0190-3918
Citations: 1
PageRank: 0.34
References: 20
Authors: 4
Name              Order   Citations   PageRank
Hai-Tao Zhang     1       83          14.27
Bingchang Tang    2       1           0.34
Xin Geng          3       1557        83.54
Huadong Ma        4       2020        179.93