Title |
---|
H-WorD: Supporting Job Scheduling in Hadoop with Workload-Driven Data Redistribution. |
Abstract |
---|
Today’s distributed data processing systems typically follow a query shipping approach and exploit data locality for reducing network traffic. In such systems the distribution of data over the cluster resources plays a significant role, and when skewed, it can harm the performance of executing applications. In this paper, we address the challenges of automatically adapting the distribution of data in a cluster to the workload imposed by the input applications. We propose a generic algorithm, named H-WorD, which, based on the estimated workload over resources, suggests alternative execution scenarios of tasks, and hence identifies required transfers of input data a priori, for timely bringing data close to the execution. We exemplify our algorithm in the context of MapReduce jobs in a Hadoop ecosystem. Finally, we evaluate our approach and demonstrate the performance gains of automatic data redistribution. |
Year | Venue | Field |
---|---|---|
2016 | ADBIS | Information system, Locality, Scheduling (computing), Workload, Computer science, Data processing system, Exploit, Job scheduler, Group method of data handling, Database, Distributed computing |
DocType | Citations | PageRank |
---|---|---|
Conference | 0 | 0.34 |
References | Authors |
---|---|
7 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Petar Jovanovic | 1 | 62 | 7.78 |
Oscar Romero | 2 | 467 | 35.46 |
Toon Calders | 3 | 1333 | 93.66 |
Alberto Abelló | 4 | 848 | 61.88 |