Astrea: Auto-Serverless Analytics Towards Cost-Efficiency and QoS-Awareness - Citegraph

Paper Info

Title
Astrea: Auto-Serverless Analytics Towards Cost-Efficiency and QoS-Awareness

Abstract
With the ability to simplify the code deployment with one-click upload and lightweight execution, serverless computing has emerged as a promising paradigm with increasing popularity. However, there remain open challenges when adapting data-intensive analytics applications to the serverless context, in which users of serverless analytics encounter the difficulty in coordinating computation across different stages and provisioning resources in a large configuration space. This paper presents our design and implementation of Astrea, which configures and orchestrates serverless analytics jobs in an autonomous manner, while taking into account flexibly-specified user requirements. Astrea relies on the modeling of performance and cost which characterizes the intricate interplay among multi-dimensional factors (e.g., function memory size, degree of parallelism at each stage). We formulate an optimization problem based on user-specific requirements towards performance enhancement or cost reduction, and develop a set of algorithms based on graph theory to obtain the optimal job execution. We deploy Astrea in the AWS Lambda platform and conduct real-world experiments over representative benchmarks, including Big Data analytics and machine learning workloads, at different scales. Extensive results demonstrate that Astrea can achieve the optimal execution decision for serverless data analytics, in comparison with various provisioning and deployment baselines. For example, when compared with three provisioning baselines, Astrea manages to reduce the job completion time by 21% to 69% under a given budget constraint, while saving cost by 20% to 84% without violating performance requirements.

Year	DOI	Venue
2022	10.1109/TPDS.2022.3172069	IEEE Transactions on Parallel and Distributed Systems
Keywords	DocType	Volume
Cloud computing,serverless computing,resource provisioning,modeling,optimization	Journal	33
Issue	ISSN	Citations
12	1045-9219	0
PageRank	References	Authors
0.34	21	4

Authors (4 rows)

Name	Order	Citations	PageRank
Jananie Jarachanthan	1	0	0.34
Li Chen	2	2	1.06
Fei Xu	3	0	0.68
Baochun Li	4	9416	614.20
