Title
Powering Multi-Task Federated Learning with Competitive GPU Resource Sharing
Abstract
Federated learning (FL) increasingly involves compound learning tasks as cognitive applications grow in complexity. For example, a self-driving system hosts multiple tasks simultaneously (e.g., detection, classification, etc.) and expects FL to sustain life-long intelligence across all of them. However, our analysis demonstrates that deploying compound FL models for multiple training tasks on a GPU raises two issues: (1) because different tasks' skewed data distributions and corresponding models create highly imbalanced learning workloads, current GPU scheduling methods lack effective resource allocation; (2) consequently, existing FL schemes, which focus only on heterogeneous data distribution and ignore runtime computing, cannot achieve optimally synchronized federation in practice. To address these issues, we propose a full-stack FL optimization scheme that handles both intra-device GPU scheduling and inter-device FL coordination for multi-task training. Specifically, our work illustrates two key insights in this research domain: (1) competitive resource sharing benefits parallel model execution, and the proposed concept of "virtual resource" can effectively characterize and guide practical per-task resource utilization and allocation; (2) FL can be further improved by taking architecture-level coordination into consideration. Our experiments demonstrate that FL throughput can be significantly increased.
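The abstract gives no implementation details, so the sketch below is only a rough illustration of the competitive-sharing idea, not the paper's method: two imbalanced task models submit work concurrently on one GPU via CUDA streams, and hypothetical per-task "virtual resource" shares (the virtual_share weights, model sizes, and stream setup are all assumptions) scale how much work each task launches per round. It assumes PyTorch.

# Minimal sketch only; NOT the paper's implementation. It illustrates
# competitive GPU sharing: two imbalanced task models submit work on
# separate CUDA streams and the hardware arbitrates the actual sharing.
# The `virtual_share` weights are hypothetical stand-ins for the paper's
# "virtual resource" notion, here scaling each task's per-step batch size.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# A heavier "detection-like" model and a lighter "classification-like" one,
# mimicking the skewed multi-task workloads described in the abstract.
detector = nn.Sequential(nn.Linear(512, 1024), nn.ReLU(), nn.Linear(1024, 64)).to(device)
classifier = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).to(device)

virtual_share = {"detect": 0.7, "classify": 0.3}  # hypothetical allocation knobs
base_batch = 256
streams = {t: torch.cuda.Stream() for t in virtual_share} if device == "cuda" else {}

def train_step(model, in_dim, batch):
    model.zero_grad()
    x = torch.randn(batch, in_dim, device=device)
    model(x).square().mean().backward()  # dummy loss, just to create GPU work

for _ in range(10):
    if device == "cuda":
        # Launch both tasks' steps on separate streams so they can overlap.
        with torch.cuda.stream(streams["detect"]):
            train_step(detector, 512, int(base_batch * virtual_share["detect"]))
        with torch.cuda.stream(streams["classify"]):
            train_step(classifier, 128, int(base_batch * virtual_share["classify"]))
        torch.cuda.synchronize()  # join before a (omitted) FL synchronization point
    else:
        # CPU fallback: sequential execution, no competitive sharing.
        train_step(detector, 512, base_batch)
        train_step(classifier, 128, base_batch)

Raising a task's share lets it submit more concurrent work per round, which is one simple way to bias the GPU's competitive arbitration toward the heavier task.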
Year: 2022
DOI: 10.1145/3487553.3524859
Venue: International World Wide Web Conference
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors: 9
Name | Order | Citations | PageRank
Yongbo Yu | 1 | 0 | 0.34
Fuxun Yu | 2 | 0 | 0.34
Zirui Xu | 3 | 0 | 0.34
Di Wang | 4 | 1337 | 143.48
Minjia Zhang | 5 | 0 | 0.34
Ang Li | 6 | 501 | 36.38
Shawn Bray | 7 | 0 | 0.34
Chenchen Liu | 8 | 90 | 17.45
Chen Xiang | 9 | 313 | 5.72