Title
Aeromancer: A Workflow Manager for Large-Scale MapReduce-Based Scientific Workflows
Abstract
The Hadoop framework has gained significant attention from the scientific community due to its applicability to large-scale data analysis in many areas. Such analysis often involves multiple stages of processing, which together constitute a workflow. While some stages of a workflow are mandatory, others depend on the type of analysis being performed. In addition, a workflow may possess data dependencies between stages that must be enforced, and it may exhibit varying levels of sensitivity. The resources needed for such data analysis can range from a laptop to an in-house cluster (or private cloud) to a public cloud. Managing such workflows across this gamut of computing resources is an unnecessarily arduous task for domain scientists. To address these challenges, we present Aeromancer, a feature-rich workflow manager for running MapReduce-based workflows that uses both client and cloud resources. Aeromancer offers an ensemble of features, including the simultaneous use of client resources (e.g., on-premises clusters) and public cloud resources, automatic data-dependency and data-transfer handling, intra-flow, on-demand cluster provisioning, and support for directed acyclic graphs (DAGs). To demonstrate its functionality, we apply Aeromancer to several bioinformatics pipelines as part of a "big data" case study in the life sciences, which seeks to increase the adoption of hybrid computing environments, including the emerging "client cloud" computing model, for running data-intensive workflows.
Year: 2014
DOI: 10.1109/TrustCom.2014.97
Venue: TrustCom
Keywords: cloud, cluster, workflow, mapreduce, hadoop, hybrid resources, instruction sets, pipelines, bioinformatics, data analysis, cloud computing, computer architecture, servers
Field: Data science, Workflow technology, Computer science, Computer security, Server, Provisioning, Workflow engine, Workflow management system, Workflow, Big data, Database, Cloud computing
DocType: Conference
ISSN: 2324-898X
Citations: 0
PageRank: 0.34
References: 7
Authors: 5
Name                    Order  Citations  PageRank
Nabeel Mohamed          1      0          0.34
Nabanita Maji           2      0          0.34
Jing Zhang              3      70         6.53
Nataliya Timoshevskaya  4      0          0.34
Wu-chun Feng            5      2812       232.50