Title
SciSpark: Applying in-memory distributed computing to weather event detection and tracking
Abstract
In this paper we present SciSpark, a Big Data framework that extends Apache Spark for scaling scientific computations. The paper details the initial architecture and design of SciSpark. We demonstrate how SciSpark achieves parallel ingesting and partitioning of earth science satellite and model datasets. We also illustrate the usability and extensibility of SciSpark by implementing aspects of the Grab 'em Tag 'em Graph 'em (GTG) algorithm using SciSpark and its Map Reduce capabilities. GTG is a topical automated method for identifying and tracking Mesoscale Convective Complexes in satellite infrared datasets.
Year
2015
DOI
10.1109/BigData.2015.7363983
Venue
Big Data
Keywords
Apache Spark, in-memory distributed computing, large scientific datasets, mesoscale convective complexes
Field
Graph, Data mining, Satellite, Spark (mathematics), Computer science, Mesoscale convective complex, Usability, Big data, Extensibility, Computation
DocType
Conference
Citations
9
PageRank
0.55
References
8
Authors
8