Title
Flexible Data-Aware Scheduling for Workflows over an In-memory Object Store
Abstract
This paper explores novel techniques for improving the performance of many-task workflows based on the Swift scripting language. We propose new programmer options for automated distributed data placement and task scheduling. These options trigger a data placement mechanism that distributes intermediate workflow data over the servers of Hercules, a distributed key-value store that can be used to cache file system data. We demonstrate that these new mechanisms can significantly improve the aggregated throughput of many-task workflows by up to 86x, reduce contention on the shared file system, exploit data locality, and trade off locality and load balance.
Year
2016
DOI
10.1109/CCGrid.2016.40
Venue
2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)
Keywords
scientific workflows, file systems, data locality, load balance, high performance
Field
Locality, File system, Scheduling (computing), Load balancing (computing), Cache, Computer science, Server, Workflow, Operating system, Distributed computing, Scripting language
DocType
Conference
ISSN
2376-4414
ISBN
978-1-5090-2454-4
Citations
3
PageRank
0.42
References
7
Authors
6
Name | Order | Citations | PageRank
Francisco Rodrigo Duro | 1 | 28 | 3.34
Javier García | 2 | 47 | 9.85
Florin Isaila | 3 | 234 | 24.01
Justin M. Wozniak | 4 | 464 | 35.32
Jesús Carretero | 5 | 552 | 69.87
Robert Ross | 6 | 2717 | 173.13