Title | ||
---|---|---|
<italic>Astrea:</italic> Auto-Serverless Analytics Towards Cost-Efficiency and QoS-Awareness |
Abstract | ||
---|---|---|
With the ability to simplify the code deployment with one-click upload and lightweight execution, serverless computing has emerged as a promising paradigm with increasing popularity. However, there remain open challenges when adapting data-intensive analytics applications to the serverless context, in which users of <italic>serverless analytics</italic> encounter the difficulty in coordinating computation across different stages and provisioning resources in a large configuration space. This paper presents our design and implementation of <italic>Astrea</italic>, which configures and orchestrates serverless analytics jobs in an autonomous manner, while taking into account flexibly-specified user requirements. <italic>Astrea</italic> relies on the modeling of performance and cost which characterizes the intricate interplay among multi-dimensional factors (e.g., function memory size, degree of parallelism at each stage). We formulate an optimization problem based on user-specific requirements towards performance enhancement or cost reduction, and develop a set of algorithms based on graph theory to obtain the optimal job execution. We deploy <italic>Astrea</italic> in the AWS Lambda platform and conduct real-world experiments over representative benchmarks, including Big Data analytics and machine learning workloads, at different scales. Extensive results demonstrate that <italic>Astrea</italic> can achieve the optimal execution decision for serverless data analytics, in comparison with various provisioning and deployment baselines. For example, when compared with three provisioning baselines, <italic>Astrea</italic> manages to reduce the job completion time by 21% to 69% under a given budget constraint, while saving cost by 20% to 84% without violating performance requirements. |
Year | DOI | Venue |
---|---|---|
2022 | 10.1109/TPDS.2022.3172069 | IEEE Transactions on Parallel and Distributed Systems |
Keywords | DocType | Volume |
Cloud computing, serverless computing, resource provisioning, modeling, optimization | Journal | 33 |
Issue | ISSN | Citations |
12 | 1045-9219 | 0 |
PageRank | References | Authors |
0.34 | 21 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jananie Jarachanthan | 1 | 0 | 0.34 |
Li Chen | 2 | 2 | 1.06 |
Fei Xu | 3 | 0 | 0.68 |
Baochun Li | 4 | 9416 | 614.20 |