Flint: batch-interactive data-intensive processing on transient servers. - Citegraph

Paper Info

Title
Flint: batch-interactive data-intensive processing on transient servers.

Abstract
Cloud providers now offer transient servers, which they may revoke at any time, for significantly lower prices than on-demand servers, which they cannot revoke. The low price of transient servers is particularly attractive for executing an emerging class of workload, which we call Batch-Interactive Data-Intensive (BIDI), that is becoming increasingly important for data analytics. BIDI workloads require large sets of servers to cache massive datasets in memory to enable low-latency operation. In this paper, we illustrate the challenges of executing BIDI workloads on transient servers, where revocations (akin to failures) are the common case. To address these challenges, we design Flint, which is based on Spark and includes automated checkpointing and server selection policies that i) support batch and interactive applications and ii) dynamically adapt to application characteristics. We evaluate a prototype of Flint using EC2 spot instances, and show that it yields cost savings of up to 90% compared to using on-demand servers, while increasing running time by < 2%.

Year	DOI	Venue
2016	10.1145/2901318.2901319	EuroSys
Field	DocType	Citations
Spark (mathematics),Virtual machine,Workload,Computer science,Cache,Server,Real-time computing,Latency (engineering),Multi-core processor,Operating system,Cloud computing,Distributed computing	Conference	24
PageRank	References	Authors
0.86	27	5

Authors (5 rows)

Name	Order	Citations	PageRank
Prateek Sharma	1	201	14.12
Tian Guo	2	66	12.57
Xin He	3	46	2.25
David E. Irwin	4	899	98.12
Prashant J. Shenoy	5	6386	521.30
