Title
Enabling Scientific Workflow Reuse through Structured Composition of Dataflow and Control-Flow
Abstract
Data-centric scientific workflows are often modeled as dataflow process networks. The simplicity of the dataflow framework facilitates workflow design, analysis, and optimization. However, modeling "control-flow intensive" tasks using dataflow constructs often leads to overly complicated workflows that are hard to comprehend, reuse, and maintain. We describe a generic framework, based on scientific workflow templates and frames, for embedding control-flow intensive subtasks within dataflow process networks. This approach can seamlessly handle complex control-flow without sacrificing the benefits of dataflow. We illustrate our approach with a real-world scientific workflow from the astrophysics domain that requires remote execution and file transfer in a semi-reliable environment. For such workflows, we also describe a three-layered architecture based on frames and templates: the top layer consists of an overall dataflow process network, the second layer consists of a transducer template for modeling the desired control-flow behavior, and the bottom layer consists of frames inside the template that are specialized by embedding the desired component implementation. Our approach can enable scientific workflows that are more robust (fault-tolerance strategies can be defined by control-flow driven transducer templates) and at the same time more reusable, since the embedding of frames and templates yields more structured and modular workflow designs.
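The three layers described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Frame`, `RetryTemplate`, `dataflow`, and `make_flaky_transfer` are hypothetical, and the retry-then-fallback policy is only one assumed example of a control-flow behavior that a transducer template might encode for file transfer in a semi-reliable environment.

```python
from typing import Callable, Iterable, Iterator, Optional

Component = Callable[[str], str]


class Frame:
    """Bottom layer (hypothetical): a named slot specialized by embedding a component."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.component: Optional[Component] = None

    def embed(self, component: Component) -> "Frame":
        self.component = component
        return self

    def run(self, item: str) -> str:
        if self.component is None:
            raise RuntimeError(f"frame '{self.name}' has no embedded component")
        return self.component(item)


class RetryTemplate:
    """Middle layer (hypothetical): fixed control flow with frames as holes.

    Assumed policy: try the primary frame up to max_attempts times,
    then hand the item to the fallback frame.
    """

    def __init__(self, max_attempts: int = 3) -> None:
        self.primary = Frame("primary")
        self.fallback = Frame("fallback")
        self.max_attempts = max_attempts

    def __call__(self, item: str) -> str:
        for _ in range(self.max_attempts):
            try:
                return self.primary.run(item)
            except IOError:
                continue  # simulated transient failure: retry
        return self.fallback.run(item)


def dataflow(source: Iterable[str], *stages: Component) -> Iterator[str]:
    """Top layer: a linear dataflow pipeline; the template is just another stage."""
    for item in source:
        for stage in stages:
            item = stage(item)
        yield item


def make_flaky_transfer(failures: int) -> Component:
    """Stand-in for a remote file transfer that fails a fixed number of times."""
    state = {"left": failures}

    def transfer(path: str) -> str:
        if state["left"] > 0:
            state["left"] -= 1
            raise IOError("simulated transfer failure")
        return "copied:" + path

    return transfer
```

In this sketch the pipeline only ever sees a single stage. The retry logic lives in the template, and swapping the transfer implementation means embedding a different component in the `primary` frame, which is the kind of reuse the abstract attributes to frames and templates.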
Year: 2006
DOI: 10.1109/ICDEW.2006.55
Venue: Atlanta, GA, USA
Keywords: overall dataflow process network, dataflow process network, modular workflow design, dataflow construct, enabling scientific workflow reuse, control-flow intensive subtasks, structured composition, complicated workflows, dataflow framework, control-flow behavior, complex control-flow, data-centric scientific workflows, control flow, data flow, fault tolerant, scientific computing, bioinformatics, robust control, data engineering, genomics, mathematical model, data structures, computer science, layered architecture
DocType: Conference
ISBN: 0-7695-2571-7
Citations: 38
PageRank: 1.79
References: 21
Authors: 4
Name
Order
Citations
PageRank
Shawn Bowers1122386.44
Bertram Ludascher2121395.07
Anne H. H. Ngu31084189.93
Terence Critchlow427535.97