Paper Info

Title
Dual Scaling VMs and Queries: Cost-Effective Latency Curtailment

Abstract
Wimpy virtual instances with few cores and little RAM are popular public and private cloud offerings because they host applications at low cost. The challenge is how to run latency-sensitive applications on such instances, which trade performance for cost. In this study, we show analytically and experimentally that simultaneously scaling resources at coarse granularity and workloads at fine granularity, i.e., submitting multiple query clones to different servers, can overcome the performance disadvantages of wimpy VM instances and achieve stringent latency targets that are even lower than the average execution times of wimpy servers. To this end, we first derive a closed-form analysis of the latency under any given VM provisioning and query replication level, considering cloning policies that can or cannot terminate outstanding clones, with or without an overhead. Validated on trace-driven simulations, our analysis accurately predicts the latency and efficiently searches for the optimal number of VMs and clones. Second, we develop a dual elastic scaler, DuoScale, that dynamically scales VMs and clones according to the workload dynamics so as to achieve the target latency in a cost-effective manner. The effectiveness of DuoScale rests on the observation that application performance scales only sub-linearly with increasing vertical or horizontal resource provisioning, i.e., resources per VM or number of VMs. We evaluate DuoScale against VM-only scaling strategies via extensive trace-driven simulations as well as experiments on a cloud test-bed. Our results show that DuoScale achieves the stringent target latency by using clones on wimpy VMs with cost savings of up to 50% compared to scaling brawny VMs, which offer better performance at a higher unit cost.
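
The intuition behind query cloning can be illustrated with a minimal sketch. It is not the paper's closed-form analysis: it assumes i.i.d. exponential service times and idle servers, so queueing delay and clone-termination overhead are ignored. The function name and parameters are hypothetical. Under these assumptions the fastest of k clones has a mean latency of 1/k of the single-copy mean, which is how cloning can push latency below the average execution time of a wimpy server.

```python
import random


def mean_latency_with_clones(clones, mean_service_time=1.0, queries=100_000, seed=0):
    """Estimate mean latency when each query runs as `clones` copies and the
    fastest reply wins.

    Assumption (not from the paper): service times are i.i.d. exponential and
    servers are idle, so queueing and clone-termination overhead are ignored.
    """
    rng = random.Random(seed)
    rate = 1.0 / mean_service_time
    total = 0.0
    for _ in range(queries):
        # The query completes as soon as its fastest clone completes.
        total += min(rng.expovariate(rate) for _ in range(clones))
    return total / queries


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, round(mean_latency_with_clones(k), 3))
```

In a loaded system each clone also consumes server capacity, so the benefit is smaller than this idealized sketch suggests; that trade-off is what the joint choice of VM count and clone count has to balance.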

Year	2017
DOI	10.1109/ICDCS.2017.231
Venue	2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)
Keywords	dual scaling VM, cost-effective latency curtailment, wimpy virtual instances, public cloud offerings, private cloud offerings, hosting applications, latency-sensitive applications, multiple query clones, closed-form analysis, trace-driven simulations, dual elastic scaler, DuoScale, workload dynamics, vertical resource provisioning, horizontal resource provisioning
Field	Computer science, Workload, Latency (engineering), Server, Parallel computing, Unit cost, Computer network, Provisioning, Granularity, Scaling, Cloud computing, Distributed computing
DocType	Conference
ISSN	1063-6927
ISBN	978-1-5386-1793-9
Citations	0
PageRank	0.34
References	23
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Juan F. Pérez	1	106	11.80
Robert Birke	2	133	15.51
Mathias Bjorkqvist	3	96	8.85
Lydia Y. Chen	4	432	52.24
