Title
Memory Scaling of Cloud-Based Big Data Systems: A Hybrid Approach
Abstract
When deploying applications with dynamic and intensive memory footprint to big data systems on public clouds, one important yet challenging question to answer is how to select a specific instance type whose memory capacity is large enough to prevent out-of-memory errors while the cost is minimized without violating performance requirements. The state-of-the-practice solution is trial and error, causing both performance overhead and additional monetary cost. This article investigates two memory scaling mechanisms in public clouds: physical memory (good performance and high cost) and virtual memory (degraded performance and no additional cost). In order to analyze the trade-off between performance and cost of the two scaling options, a performance-cost model is developed that is driven by a lightweight analytic prediction approach through a compact representation of the memory footprint. In addition, for those scenarios when the footprint is unavailable, a meta-model-based prediction method is proposed using just-in-time migration mechanisms. The proposed techniques have been extensively evaluated with various benchmarks and real-world applications on Amazon Web Services: the performance-cost model is highly accurate and the proposed just-in-time migration approach reduces the monetary cost by up to 66 percent.
Year
2022
DOI
10.1109/TBDATA.2020.3035522
Venue
IEEE Transactions on Big Data
Keywords
Memory management, resource allocation, cloud computing
DocType
Journal
Volume
8
Issue
5
ISSN
2332-7790
Citations
0
PageRank
0.34
References
24
Authors
5
Name            Order  Citations  PageRank
Xinying Wang    1      0          0.34
Cong Xu         2      1154       48.25
Ke Wang         3      313        13.66
Feng Yan        4      163        18.78
Dongfang Zhao   5      362        26.49