Title
A data-aware workflow scheduling algorithm for heterogeneous distributed systems
Abstract
Workflow scheduling in heterogeneous distributed systems is hard to solve because both the intermediate data transfer time and the computation time of each task must be considered. The heterogeneity of the computing power of the distributed computational sites, and of the bandwidth between them, makes the scheduling problem challenging. In this study, we improve a heuristic-based data-aware algorithm to find the optimal schedule, so that the turnaround time of the workflow is minimized. In most cases, our improved algorithm outperforms the existing algorithms in both performance and time efficiency. We also extend our algorithm to solve the co-scheduling problem, in which each task of the workflow can request data from a remote data site before its execution and can store important intermediate data to a remote data site after its execution. The results show that our data-aware algorithm shortens the turnaround time of the workflow significantly compared to the existing optimal algorithms.
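To make the trade-off described in the abstract concrete, the following is a minimal sketch of a generic greedy earliest-finish-time scheduler that accounts for both computation time and data transfer time. It is not the algorithm from the paper, whose details are not given in this record. The function name `greedy_data_aware_schedule`, the site names, and all numbers are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def greedy_data_aware_schedule(parents, work, out_size, speed, bandwidth):
    """Illustrative greedy scheduler (NOT the paper's algorithm).

    parents:   task -> list of tasks whose output it needs
    work:      task -> amount of computation
    out_size:  task -> size of the data it sends to its children
    speed:     site -> computation per second (heterogeneous sites)
    bandwidth: (from_site, to_site) -> data per second
    Returns (placement, turnaround_time).
    """
    finish, placed = {}, {}
    site_free = {site: 0.0 for site in speed}
    # Visit tasks so that every parent is scheduled before its children.
    for task in TopologicalSorter(parents).static_order():
        best = None
        for site in speed:
            ready = 0.0
            for p in parents.get(task, ()):
                # Data already on this site costs nothing to move.
                if placed[p] == site:
                    transfer = 0.0
                else:
                    transfer = out_size[p] / bandwidth[(placed[p], site)]
                ready = max(ready, finish[p] + transfer)
            start = max(ready, site_free[site])
            end = start + work[task] / speed[site]
            if best is None or end < best[0]:
                best = (end, site)
        finish[task], placed[task] = best
        site_free[placed[task]] = finish[task]
    return placed, max(finish.values())


# Hypothetical workflow: task "a" feeds two identical tasks "b" and "c".
parents = {"a": [], "b": ["a"], "c": ["a"]}
work = {"a": 3, "b": 6, "c": 6}
speed = {"fast": 3, "slow": 2}
bandwidth = {("fast", "slow"): 10, ("slow", "fast"): 10}

# Small intermediate data: one child is offloaded to the slow site.
_, cheap = greedy_data_aware_schedule(parents, work, {"a": 5}, speed, bandwidth)
# Large intermediate data: both children stay on the fast site.
_, costly = greedy_data_aware_schedule(parents, work, {"a": 20}, speed, bandwidth)
print(cheap, costly)  # 4.5 5.0
```

In this toy case the same workflow is placed differently depending only on the size of the intermediate data, which is the effect that a data-aware scheduler has to capture.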
Year
2011
DOI
10.1109/HPCSim.2011.5999814
Venue
HPCS
Keywords
data-aware workflow scheduling algorithm, scheduling, optimal scheduling, heterogeneous distributed system, large scale systems, workflow scheduling, data intensive supercomputing, grid and cluster computing, large scale scientific computing, distributed processing, heuristic-based data-aware algorithm, scheduling algorithm, distributed computing, scientific computing, data transfer, distributed database, cluster computing, distributed databases, scheduling problem, bandwidth
Field
Heuristic, Job shop scheduling, Fair-share scheduling, Computer science, Scheduling (computing), Algorithm, Real-time computing, Turnaround time, Dynamic priority scheduling, Workflow management system, Workflow, Distributed computing
DocType
Conference
ISBN
978-1-61284-380-3
Citations
2
PageRank
0.37
References
9
Authors
2
Name            Order   Citations   PageRank
Dengpan Yin     1       46          3.64
Tevfik Kosar    2       614         48.67