Title
Understanding provenance black boxes
Abstract
Current provenance stores associated with workflow management systems (WfMSs) capture enough coarse-grained information to describe which datasets were used and which processes were run. While this information is enough to rebuild a workflow run, it is not enough to facilitate user understanding. Because the data is manipulated via a series of black boxes, it is often impossible for a human to understand what happened to the data. In this work, we highlight the missing information that can assist user understanding. Unfortunately, provenance information is already very complex and difficult for a user to comprehend, a problem that can be exacerbated by adding the extra information needed for deeper black-box understanding. To alleviate this, we develop a model of provenance answers that follow a "roll up", "drill down" strategy. We evaluate these techniques to determine whether users gain a better understanding of provenance information. We show how this information can be captured by workflow management systems, and that the structures and information needed for this model are a negligible addition to standard provenance stores. Finally, we implement these techniques in a real provenance system and evaluate the feasibility of the implementation.
Year: 2010
DOI: 10.1007/s10619-009-7058-3
Venue: Distributed and Parallel Databases
Keywords: Provenance, Lineage, Workflow management systems, Usability
Field: Data science, World Wide Web, Workflow technology, Computer science, Usability, Drill down, Provenance, Black box, Workflow engine, Workflow, Workflow management system, Distributed computing
DocType: Journal
Volume: 27
Issue: 2
ISSN: 0926-8782
Citations: 5
PageRank: 0.44
References: 35
Authors: 2
Name
Order
Citations
PageRank
Adriane Chapman138227.65
H. V. Jagadish2111412495.67