Title
IPSO: A Scaling Model for Data-Intensive Applications
Abstract
Today's data center applications are predominantly data-intensive, calling for scaling out the workload to a large number of servers for parallel processing. Unfortunately, the existing scaling laws, notably Amdahl's and Gustafson's laws, are inadequate to characterize the scaling properties of data-intensive workloads. To fill this void, in this paper, we put forward a new scaling model, called In-Proportion and Scale-Out-induced scaling model (IPSO). IPSO generalizes the existing scaling models in two important aspects. First, it accounts for the possible in-proportion scaling, i.e., the scaling of the serial portion of the workload in proportion to the scaling of the parallelizable portion of the workload. Second, it takes into account the possible scale-out-induced scaling, i.e., the scaling of the collective overhead or workload induced by scaling out. IPSO exposes scaling properties of data-intensive workloads, rendering the existing scaling laws its special cases. In particular, IPSO reveals two new pathological scaling properties. Namely, the speedup may level off even in the case of the fixed-time workload underlying Gustafson's law, and it may peak and then fall as the system scales out. Extensive MapReduce and Spark-based case studies demonstrate that IPSO successfully captures diverse scaling properties of data-intensive applications. As a result, it can serve as a diagnostic tool to gain insights on or even uncover counter-intuitive root causes of observed scaling behaviors, especially pathological ones, for data-intensive applications. Finally, preliminary results also demonstrate the promising prospects of IPSO to facilitate effective resource provisioning to achieve the best speedup-versus-cost tradeoffs for data-intensive applications.
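For reference, the two classical laws that the abstract says IPSO generalizes can be sketched as follows. This is a minimal illustration of the textbook formulas only, not of the IPSO model itself, whose equations are not given in this record. The function names and the `parallel_fraction` parameter are assumptions chosen for this example.

```python
def amdahl_speedup(parallel_fraction: float, n: int) -> float:
    """Amdahl's law (fixed-size workload): 1 / ((1 - p) + p / n)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n)


def gustafson_speedup(parallel_fraction: float, n: int) -> float:
    """Gustafson's law (fixed-time workload): (1 - p) + p * n."""
    serial_fraction = 1.0 - parallel_fraction
    return serial_fraction + parallel_fraction * n


if __name__ == "__main__":
    # Hypothetical parallel fraction of 0.95, chosen only for illustration.
    p = 0.95
    for n in (1, 8, 64, 512):
        print(n, round(amdahl_speedup(p, n), 2), round(gustafson_speedup(p, n), 2))
```

Under Amdahl's law the speedup is bounded by 1 / (1 - p) no matter how many servers are added, whereas under Gustafson's law it grows linearly in n. The abstract's claim is that data-intensive workloads can depart from both behaviors, for example by leveling off even in the fixed-time case.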
Year
2019
DOI
10.1109/ICDCS.2019.00032
Venue
2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS)
Keywords
scale-out workload, cloud computing, speedup, performance evaluation, Amdahl's Law, Gustafson's Law
Field
Data modeling, Computer science, Amdahl's law, Workload, Server, Provisioning, Rendering (computer graphics), Scaling, Distributed computing, Speedup
DocType
Conference
ISSN
1063-6927
ISBN
978-1-7281-2520-6
Citations
0
PageRank
0.34
References
12
Authors
6
Name           Order  Citations  PageRank
Zhongwei Li    1      5          1.41
Feng Duan      2      87         27.49
Minh Nguyen    3      8          1.90
Hao Che        4      391        29.65
Yu Lei         5      20         4.72
Hong Jiang     6      2137       157.96