Title | ||
---|---|---|
Towards Building A Scalable Data Analytics System On Clouds: An Early Experience On Alicloud |
Abstract | ||
---|---|---|
With the development of big data, big data processing systems such as Hadoop and Spark are widely used to handle large-scale data. To spare users the complexity and cost of building a self-owned big data processing system, cloud providers tend to deploy big data processing tools as cloud services. Typical examples include Amazon EMR, Azure HDInsight and AliCloud E-MapReduce. However, building a cost-efficient system and scaling it remain challenging. In this paper, we conduct a case study on AliCloud E-MapReduce and analyze system performance on local and remote file systems. We compare the scalability of Hadoop and Spark using scale-out and scale-up strategies, respectively. Based on the analysis results, we derive several observations and implications that can help guide performance optimization. |
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/CLOUD.2018.00129 | PROCEEDINGS 2018 IEEE 11TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD) |
Keywords | Field | DocType |
scalability evaluation, cloud-based data processing, SaaS | Big data processing, Data science, Spark (mathematics), Task analysis, Data analysis, Computer science, Big data, Benchmark (computing), Distributed computing, Cloud computing, Scalability | Conference |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Congfeng Jiang | 1 | 10 | 2.93 |
Wei Huang | 2 | 78 | 24.31 |
Zujie Ren | 3 | 89 | 8.14 |
Youhuizi Li | 4 | 727 | 31.40 |
Jian Wan | 5 | 483 | 56.15 |
Feng Cao | 6 | 5 | 1.84 |
Jiangbin Lin | 7 | 3 | 0.81 |