Title
Large data and computation in a hazard map workflow using Hadoop and Neteeza architectures
Abstract
Uncertainty Quantification (UQ) using simulation ensembles leads to the twin challenges of managing large amounts of data and performing CPU-intensive computing. While algorithmic innovations using surrogates, localization, and parallelization can make the problem feasible, one is still left with very large data and compute tasks. Such integration of large-data analytics and computationally expensive tasks is increasingly common. We present an approach to this problem that uses a mix of hardware and a workflow that maps tasks to the appropriate hardware. We experiment with two computing environments: the first integrates a Netezza data warehouse appliance with a high performance cluster, and the second is a Hadoop-based environment. Our approach segregates the data-intensive and compute-intensive tasks and assigns the right architecture to each. We present the computing models and the new schemes in the context of generating probabilistic hazard maps using ensemble runs of the volcanic debris avalanche simulator TITAN2D and UQ methodology.
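The abstract gives no implementation detail, but the task-segregation idea it describes can be illustrated with a small sketch. The Python snippet below is a hypothetical dispatcher, not the authors' code: all task names, file paths, and simulator flags are assumptions. It routes compute-intensive ensemble stages to an HPC batch scheduler and data-intensive aggregation stages to a Hadoop streaming job, which is one minimal way to realize "assigning the right architecture to each" task.

```python
# Hypothetical sketch of the task-segregation idea in the abstract:
# compute-intensive work (e.g. TITAN2D ensemble runs) goes to an HPC batch
# scheduler, data-intensive work (e.g. reducing ensemble output to a hazard
# map) goes to a Hadoop-style platform. Commands are only printed here,
# never executed, so the script runs anywhere.

from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    kind: str           # "compute" or "data"
    command: List[str]  # payload to run on the chosen backend


def dispatch(task: Task) -> List[str]:
    """Wrap a task in the submission command for the backend suited to it."""
    if task.kind == "compute":
        # CPU-intensive simulation: submit to the cluster's batch scheduler.
        return ["sbatch", "--job-name", task.name, "--wrap", " ".join(task.command)]
    if task.kind == "data":
        # Data-intensive aggregation: run as a Hadoop streaming job.
        return ["hadoop", "jar", "hadoop-streaming.jar"] + task.command
    raise ValueError(f"unknown task kind: {task.kind}")


if __name__ == "__main__":
    workflow = [
        # Ensemble of simulator runs over sampled inputs (compute-heavy);
        # the "titan2d" flags are placeholders, not the real CLI.
        Task("titan2d-ensemble", "compute",
             ["titan2d", "--input", "samples.csv", "--output", "flow_depths/"]),
        # Reduce ensemble output to per-cell exceedance probabilities (data-heavy);
        # mapper/reducer script names are likewise hypothetical.
        Task("hazard-map-reduce", "data",
             ["-input", "flow_depths/", "-output", "hazard_map/",
              "-mapper", "emit_cell_depth.py", "-reducer", "exceedance_prob.py"]),
    ]
    for task in workflow:
        print(" ".join(dispatch(task)))
```

The point of the split is that only the submission wrapper changes per backend; the workflow description itself stays the same whether the data-intensive stage runs on Netezza, Hadoop, or elsewhere.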
Year
2013
DOI
10.1145/2534645.2534648
Venue
DISCS@SC
Field
Data mining, Architecture, Uncertainty quantification, Data analysis, Computer science, Data warehouse appliance, Probabilistic logic, Hazard map, Workflow, Distributed computing, Computation
DocType
Conference
Citations
0
PageRank
0.34
References
2
Authors
3
Name              Order  Citations  PageRank
Shivaswamy Rohit  1      0          0.34
Abani K. Patra    2      1092       2.05
Vipin Chaudhary   3      8388       3.24