Title
Scrooge: A Cost-Effective Deep Learning Inference System
Abstract
Advances in deep learning (DL) have prompted the development of cloud-hosted DL-based media applications that process video and audio streams in real-time. Such applications must satisfy throughput and latency objectives and adapt to novel types of dynamics, while incurring minimal cost. Scrooge, a system that provides media applications as a service, achieves these objectives by packing computations efficiently into GPU-equipped cloud VMs, using an optimization formulation to find the lowest-cost VM allocations that meet the performance objectives, and rapidly reacting to variations in input complexity (e.g., changes in participants in a video). Experiments show that Scrooge can save serving cost by 16-32% (which translates to tens of thousands of dollars per year) relative to the state of the art while achieving latency objectives over 98% of the time under dynamic workloads.
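The abstract only names the cost-minimizing allocation at a high level; the paper's actual formulation is not reproduced in this record. As a rough, hypothetical sketch of what a lowest-cost VM allocation problem of this kind can look like (the symbols below, VM types v with hourly price c_v, sustainable throughput T_v, and per-request latency l_v, are illustrative assumptions, not taken from Scrooge):

\[
\min_{x} \; \sum_{v} c_v \, x_v
\quad \text{s.t.} \quad
\sum_{v} T_v \, x_v \ge R,
\qquad
l_v \le L \ \text{whenever } x_v > 0,
\qquad
x_v \in \mathbb{Z}_{\ge 0},
\]

where x_v is the number of VMs of type v to rent, R is the required throughput of the media pipeline, and L is its latency objective.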
Year
2021
DOI
10.1145/3472883.3486993
Venue
ACM Symposium on Cloud Computing (SoCC)
Keywords
Cloud computing, deep learning inference, auto-scaling
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
3
Name              Order  Citations  PageRank
Yitao Hu          1      0          0.68
Rajrup Ghosh      2      0          0.34
Ramesh Govindan   3      15430      2144.86