Title
Polyflow: A SOA for Analyzing Workflow Heterogeneous Provenance Data in Distributed Environments
Abstract
In the last decade, the (big) data-driven science paradigm became a widespread reality. However, this approach has limitations, such as a performance dependency on the quality of the data and a lack of reproducibility of results. To enable reproducibility, many tools, such as Workflow Management Systems, were developed to formalize process pipelines and capture execution traces. However, interoperating the data generated by these solutions became a problem, since most systems adopted proprietary data models. To support interoperability across heterogeneous provenance data, we propose a Service-Oriented Architecture with a polystore storage design in which provenance is conceptually represented using the ProvONE model. A wrapper layer is responsible for transforming data described in heterogeneous formats into ProvONE-compliant data. Moreover, we propose a query layer that provides location and access transparency to users. Furthermore, we conduct two feasibility studies showcasing real use-case scenarios. First, we illustrate how two research groups can compare their processes and results. Second, we show how our architecture can be used as a queryable provenance repository. We show Polyflow's viability for both scenarios using the Goal-Question-Metric methodology. Finally, we show our solution's usability and extensibility appeal by comparing it to similar approaches.
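The wrapper and query layers described in the abstract can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the two source formats (`system_a`, `system_b`), their field names, the `Execution` record (only loosely inspired by ProvONE's notion of an execution), and the function names are hypothetical and are not taken from Polyflow itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Execution:
    """Hypothetical common record, loosely modeled on a ProvONE-style execution."""
    source: str      # which store the trace came from
    program: str     # name of the step that ran
    inputs: tuple    # names of data items used
    outputs: tuple   # names of data items generated


def wrap_system_a(row: dict) -> Execution:
    """Wrapper for an invented format with 'task', 'in' and 'out' keys."""
    return Execution("system_a", row["task"], tuple(row["in"]), tuple(row["out"]))


def wrap_system_b(row: dict) -> Execution:
    """Wrapper for an invented format with 'activity' and role-tagged 'files'."""
    ins = tuple(f["name"] for f in row["files"] if f["role"] == "used")
    outs = tuple(f["name"] for f in row["files"] if f["role"] == "generated")
    return Execution("system_b", row["activity"], ins, outs)


# One wrapper per heterogeneous source (assumed names).
WRAPPERS = {"system_a": wrap_system_a, "system_b": wrap_system_b}

# Toy stand-ins for the underlying stores of a polystore.
STORES = {
    "system_a": [
        {"task": "align", "in": ["genome.fa"], "out": ["hits.sam"]},
        {"task": "plot", "in": ["hits.sam"], "out": ["fig.png"]},
    ],
    "system_b": [
        {"activity": "align",
         "files": [{"name": "reads.fq", "role": "used"},
                   {"name": "aln.bam", "role": "generated"}]},
    ],
}


def query_by_program(stores: dict, program: str) -> list:
    """Query layer sketch: the caller names a program, not a store or a format."""
    results = []
    for source, rows in stores.items():
        wrap = WRAPPERS[source]
        results.extend(e for e in map(wrap, rows) if e.program == program)
    return results


if __name__ == "__main__":
    for execution in query_by_program(STORES, "align"):
        print(execution)
```

In this sketch the caller of `query_by_program` never states where a trace is stored or how it is encoded, which is the kind of location and access transparency the abstract attributes to the query layer; a real deployment would expose this behind services rather than in-process dictionaries.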
Year
2019
DOI
10.1145/3330204.3330259
Venue
Proceedings of the XV Brazilian Symposium on Information Systems
Keywords
Workflows interoperability, heterogeneous provenance data integration, polystore
Field
Software engineering, Computer science, Provenance, Workflow
DocType
Conference
ISBN
978-1-4503-7237-4
Citations
0
PageRank
0.34
References
0
Authors
4
Name; Order; Citations; PageRank
Yan Mendes; 1; 0; 0.34
Regina M. M. Braga; 2; 9425.25
Victor Ströele; 3; 0; 1.01
Daniel de Oliveira; 4; 61759.87