Abstract | ||
---|---|---|
Datacenters are under-utilized, primarily due to unused resources on over-provisioned nodes of latency-critical jobs. Such idle resources can be used to run batch data analytics jobs to increase datacenter utilization, but these transient resources must be evicted whenever latency-critical jobs require them again. Resource evictions often lead to cascading recomputations, which are usually handled by checkpointing intermediate results to stable storage on eviction-free reserved resources. However, checkpointing incurs substantial overhead from transferring data back and forth. In this work, we step away from such approaches and focus on observing the job structure and the relationships between computations of the job. We carefully mark the computations that are most likely to cause a large number of recomputations upon evictions, and run them reliably on reserved resources. This lets us retain the corresponding intermediate results without any additional checkpointing. We design Pado, a general data processing engine, which carries out our idea with several optimizations that minimize the number of additional reserved nodes. Evaluation results show that Pado outperforms Spark 2.0.0 by up to 5.1×, and checkpoint-enabled Spark by up to 3.8×. |
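
The marking idea in the abstract can be illustrated with a toy sketch. This is not Pado's actual placement policy, which the abstract does not specify. It assumes a hypothetical heuristic: score each operator by the number of upstream operators that would have to be rerun, in the worst case, if its output were lost, and place operators at or above a hypothetical `threshold` on reserved resources. The names `count_ancestors`, `place_operators`, and the example job are all made up for illustration.

```python
def count_ancestors(dag):
    """Count the transitive upstream operators of every operator.

    `dag` maps each operator name to the list of operators it reads from.
    The count is a rough worst-case proxy (an assumption of this sketch)
    for how much work is redone if the operator's output is lost.
    """
    memo = {}

    def ancestors(op):
        if op not in memo:
            found = set()
            for parent in dag.get(op, []):
                found.add(parent)
                found |= ancestors(parent)
            memo[op] = found
        return memo[op]

    return {op: len(ancestors(op)) for op in dag}


def place_operators(dag, threshold):
    """Mark operators whose score reaches `threshold` as 'reserved'."""
    scores = count_ancestors(dag)
    return {
        op: ("reserved" if score >= threshold else "transient")
        for op, score in scores.items()
    }


# Hypothetical job: two inputs are mapped, joined, then reduced.
job = {
    "read_a": [],
    "read_b": [],
    "map_a": ["read_a"],
    "map_b": ["read_b"],
    "join": ["map_a", "map_b"],
    "reduce": ["join"],
}
placement = place_operators(job, threshold=4)
```

With this example job, `join` and `reduce` land on reserved resources, so their intermediate results survive evictions of the transient nodes that run the reads and maps.
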
Year | DOI | Venue |
---|---|---|
2017 | 10.1145/3064176.3064181 | EuroSys |
Field | DocType | Citations |
Data processing, Spark (mathematics), Idle, Computer science, Real-time computing, Operating system, Computation, Distributed computing | Conference | 10 |
PageRank | References | Authors |
0.62 | 17 | 8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Youngseok Yang | 1 | 14 | 1.70 |
Geon-Woo Kim | 2 | 10 | 0.62 |
Won Wook Song | 3 | 11 | 0.98 |
Yunseong Lee | 4 | 15 | 2.72 |
Andrew Chung | 5 | 44 | 3.57 |
Zhengping Qian | 6 | 350 | 17.04 |
Brian Cho | 7 | 199 | 15.57 |
Byung-Gon Chun | 8 | 3832 | 234.37 |