Title
Leveraging Distributed Big Data Storage Support in CLAaaS for WINGS Workflow Management System
Abstract
Cloud-based Analytics-as-a-Service (CLAaaS) was developed by Zulkernine et al. with the goal of simplifying big data analytics for users. It provides software-as-a-service access to a variety of back-end analytics tools and data stores. One of these tools is Workflow Instance Generation and Selection (WINGS). WINGS supports three capabilities: reuse of predefined workflows and their components, which carry semantic metadata, to define new workflows; late binding of workflows to data at execution time, so that the most recent data is used; and definition of domain-specific software code as custom analytic components in workflows. However, the data used in WINGS workflows are mostly flat files stored on the WINGS server or in shared directories. The goal of this project is to add support for big data storage systems to WINGS and to validate the extensions using multiple data analytic workflows of different complexities, with data residing in a variety of back-end data sources. The extension allows CLAaaS users to create, validate, and execute analytic workflows in a distributed environment and to use data from multiple big data storage systems. We validate our work using four big data storage systems in WINGS workflows, namely Apache HBase, MongoDB, MySQL with a front-end interface.
Year: 2017
DOI: 10.1109/BigData.2017.8258200
Venue: 2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA)
Keywords: Distributed environment, WINGS, CLAaaS, HBase, MongoDB, Components
DocType: Conference
ISSN: 2639-1589
Citations: 0
PageRank: 0.34
References: 0
Authors (3)
Name | Order | Citations | PageRank
Hadeel Alghamdi | 1 | 0 | 0.34
Farhana Zulkernine | 2 | 166 | 16.28
Patrick Martin | 3 | 148 | 18.22