Title
Improving the understanding of provenance and reproducibility of a multi-sensor merged climate data record
Abstract
Multi-decadal climate data records are critical to studying climate variability and change. They often require merging data from multiple instruments, such as those on NASA's A-Train, whose measurements cover a wide range of atmospheric conditions and phenomena. A multi-decadal climate data record of water vapor measurements from sensors on the A-Train, operational weather satellites, and other satellites is being assembled from existing data sources or produced with well-established methods published in the peer-reviewed literature. However, the immense volume and inhomogeneity of the data often require an "exploratory computing" approach to product generation, in which data is processed in a variety of ways with varying algorithms, parameters, and code changes until an acceptable product is generated. Furthermore, the data product information associated with source data, processing methods, parameters used, intermediate and final product outputs, and associated materials is often hidden in each of the trials and scattered throughout the processing system(s). We will present methods to help users better capture and explore the production legacy of the data, metadata, ancillary files, code, and computing environment changes used during the production of these merged, multi-sensor data products. By building provenance services on semantic and provenance technologies, we show how to leverage provenance-as-a-service to capture sufficient information to enable users to track processing, perform faceted searches on the provenance record, and visualize the provenance of the products and processing lineage. We will also present services for capturing sufficient provenance information and the associated artifacts to enable some reproducibility of these climate data records.
Year: 2012
DOI: 10.1007/978-3-642-34222-6_24
Venue: IPAW
Keywords: multi-sensor data product, multi-sensor merged climate data, provenance record, climate data record, sufficient provenance information, provenance service, source data, data product information, provenance technology, data source, multi-decadal climate data record
Field: Metadata, Data mining, Satellite, Final product, Computer science, Source data, Visualization, Sensor fusion, Provenance, Web service, Database
DocType: Conference
Volume: 7525
ISSN: 0302-9743
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name            Order  Citations  PageRank
Hook Hua        1      0          2.03
Brian Wilson    2      5          1.65
Gerald Manipon  3      5          1.31
Lei Pan         4      299.49
Eric Fetzer     5      129.54