Title
Towards Reliable (and Efficient) Job Executions in a Practical Geo-distributed Data Analytics System.
Abstract
Geo-distributed data analytics is increasingly used to derive useful information in large organisations. Naively extending existing cluster-scale data analytics systems to the scale of geo-distributed data centers faces unique challenges, including WAN bandwidth limits, regulatory constraints, a changeable and unreliable runtime environment, and monetary costs. Our goal in this work is to develop a practical geo-distributed data analytics system that (1) employs an intelligent mechanism that lets jobs efficiently utilize resources across data centers and adjust to the changeable environment; (2) guarantees the reliability of jobs in the face of possible failures; and (3) is generic and flexible enough to run a wide range of data analytics jobs without requiring any changes. To this end, we present a new, general geo-distributed data analytics system, HOUTU, that is composed of multiple autonomous systems, each operating in a sovereign data center. HOUTU maintains a job manager (JM) for a geo-distributed job in each data center, so that these replicated JMs can individually and cooperatively manage resources and assign tasks. Our experiments on the prototype of HOUTU running across four Alibaba Cloud regions show that HOUTU provides job performance nearly as efficient as that of the existing centralized architecture, and guarantees reliable job executions when facing failures.
Year
2018
Venue
arXiv: Distributed, Parallel, and Cluster Computing
Field
Architecture, Data analysis, Computer science, Bandwidth (signal processing), Autonomous system (Internet), Job performance, Data center, Cloud computing, Distributed computing
DocType
Journal
Volume
abs/1802.00245
Citations
0
PageRank
0.34
References
18
Authors
7
Name            Order  Citations  PageRank
Xiaoda Zhang    1      0          2.03
Zhuzhong Qian   2      380        51.27
Sheng Zhang     3      44         15.62
Yize Li         4      43         3.29
Xiangbo Li      5      0          0.34
Xiaoliang Wang  6      91         24.74
Sanglu Lu       7      1380       144.07