Title
A Data Intensive Statistical Aggregation Engine: A Case Study for Gridded Climate Records
Abstract
Satellite-derived climate instrument records are often highly structured and conform to the "Data-Cube" topology. However, data volumes on the order of tens to hundreds of terabytes make it difficult to perform the rigorous statistical aggregation and analytics needed to investigate how our climate is changing over time and space. It is especially cumbersome to supply the full derivation (provenance) of such an analysis, as scientific conferences and journals increasingly require. In this paper, we describe our approach to creating a 55-terabyte decadal record of Outgoing Longwave Spectrum (OLS) from the NASA Atmospheric Infrared Sounder (AIRS), and we present "Gridderama", our open-source, data-intensive statistical aggregation engine. Gridderama is intended primarily for climate trend analysis and may be applicable to other aggregation problems involving large structured datasets.
Year
2013
DOI
10.1109/IPDPSW.2013.87
Venue
Parallel and Distributed Processing Symposium Workshops & PhD Forum
Keywords
data intensive statistical aggregation, rigorous statistical aggregation, gridded climate, data scale, case study, large structured datasets, climate trend analysis, statistical aggregation engine, infrared sounder, climate instrument record, aggregation problem, long wave, nasa atmospheric, market research, topology, statistical analysis, parallel processing, public domain software, workflow, big data, artificial satellites, aggregation, meteorology, data handling, engines
Field
Data science, Data mining, Atmospheric Infrared Sounder, Trend analysis, Satellite, Terabyte, Computer science, As is, Analytics, Big data, Group method of data handling
DocType
Conference
ISBN
978-0-7695-4979-8
Citations
0
PageRank
0.34
References
9
Authors
4
Name
Order
Citations
PageRank
David Chapman100.34
Tyler Simon2457.29
Phuong Nguyen3516.56
Milton Halem48629.78