Abstract | | |
---|---|---|
The Hadoop framework has gained significant attention from the scientific community due to its applicability to large-scale data analysis in many areas. This analysis often involves multiple stages of processing, which, in turn, constitute a workflow. While some stages of a workflow are mandatory, others are subject to the type of analysis to be done. In addition, a workflow may possess data dependencies between stages that must be enforced, and it may exhibit varying levels of sensitivity. The resources needed for such data analysis can range from a laptop to in-house clusters (or private cloud) to a public cloud. Managing such workflows, while using such a gamut of computing resources, is an unnecessarily arduous task for domain scientists. To address the above challenges, we present Aeromancer, a feature-rich workflow manager for running MapReduce-based workflows that utilizes both client and cloud resources. Aeromancer offers an ensemble of features, including the simultaneous use of client resources (e.g., on-premises clusters) and public cloud resources, automatic data-dependency and data-transfer handling, intra-flow, on-demand cluster provisioning, and support for directed acyclic graphs (DAGs). To demonstrate its functionality, we apply Aeromancer to several bioinformatics pipelines, as part of a "big data" case study in the life sciences, which seeks to increase the adoption of hybrid computing environments, including the emerging "client cloud" computing model, for running data-intensive workflows. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/TrustCom.2014.97 | TrustCom |
Keywords | Field | DocType |
cloud, cluster, workflow, mapreduce, hadoop, hybrid resources, instruction sets, pipelines, bioinformatics, data analysis, cloud computing, computer architecture, servers | Data science, Workflow technology, Computer science, Computer security, Server, Provisioning, Workflow engine, Workflow management system, Workflow, Big data, Database, Cloud computing | Conference |
ISSN | Citations | PageRank |
2324-898X | 0 | 0.34 |
References | Authors | |
7 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Nabeel Mohamed | 1 | 0 | 0.34 |
Nabanita Maji | 2 | 0 | 0.34 |
Jing Zhang | 3 | 70 | 6.53 |
Nataliya Timoshevskaya | 4 | 0 | 0.34 |
Wu-chun Feng | 5 | 2812 | 232.50 |