Title |
---|
SciSpark: Applying in-memory distributed computing to weather event detection and tracking |
Abstract |
---|
In this paper we present SciSpark, a Big Data framework that extends Apache Spark for scaling scientific computations. The paper details the initial architecture and design of SciSpark. We demonstrate how SciSpark achieves parallel ingesting and partitioning of earth science satellite and model datasets. We also illustrate the usability and extensibility of SciSpark by implementing aspects of the Grab 'em Tag 'em Graph 'em (GTG) algorithm using SciSpark and its Map Reduce capabilities. GTG is a topical automated method for identifying and tracking Mesoscale Convective Complexes in satellite infrared datasets. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/BigData.2015.7363983 | Big Data |
Keywords | Field | DocType |
Apache Spark, in-memory distributed computing, large scientific datasets, mesoscale convective complexes | Graph,Data mining,Satellite,Spark (mathematics),Computer science,Mesoscale convective complex,Usability,Big data,Extensibility,Computation | Conference |
Citations | PageRank | References |
9 | 0.55 | 8 |
Authors |
---|
8 |
Name | Order | Citations | PageRank |
---|---|---|---|
Rahul Palamuttam | 1 | 9 | 0.55 |
Renato Marroquin Mogrovejo | 2 | 9 | 0.55 |
Chris A. Mattmann | 3 | 200 | 25.39 |
Brian D. Wilson | 4 | 10 | 1.62 |
Kim Whitehall | 5 | 17 | 1.91 |
Rishi Verma | 6 | 15 | 2.04 |
Lewis J. McGibbney | 7 | 9 | 0.55 |
Paul M. Ramirez | 8 | 11 | 1.65 |