Title
Massively parallel regularized 3D inversion of potential fields on CPUs and GPUs
Abstract
We have recently introduced a massively parallel regularized 3D inversion of potential fields data. This program takes as input gravity or magnetic vector, tensor, and Total Magnetic Intensity (TMI) measurements and produces a 3D volume of density, susceptibility, or a three-dimensional magnetization vector, the latter also including magnetic remanence information. The code uses a combined MPI and OpenMP approach that maps well onto current multiprocessor, multicore clusters and exhibits nearly linear strong and weak parallel scaling. It has been used on large clusters to invert regional- to continental-scale data sets with up to a billion cells of the 3D Earth volume for the interpretation of large airborne gravity and magnetic surveys. In this paper we explain the features that made this massive parallelization feasible and extend the code with GPU support in the form of OpenACC directives. This implementation resulted in up to a 22x speedup compared to the scalar multithreaded implementation on a compute node with a 12-core Intel CPU. Furthermore, we introduce a mixed single-double precision approach, which allows us to perform most of the calculation in single floating-point precision while keeping the result as precise as if double precision had been used. This approach provides an additional 40% speedup on the GPUs compared to the pure double precision implementation, and it has about half the memory footprint of the fully double precision version.
Highlights
- We describe the implementation of a scalable, massively parallel potential fields modeling and inversion program.
- The code is capable of inverting gravity data for density and magnetic data for susceptibility or magnetization vector.
- Key features that allow scalability and good performance are the use of a moving sensitivity domain around each data receiver and on-demand calculation of the sensitivity.
- Further improvement in performance is gained by the use of mixed single-double precision arithmetic.
- The code is parallelized with MPI and OpenMP; alternatively, the computation-heavy kernels can be offloaded to GPUs using OpenACC.
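The two performance techniques named above, OpenACC offload of the compute-heavy kernels and mixed single-double precision arithmetic, can be illustrated with a short sketch. The following C fragment is a hypothetical, minimal example and not the published code: the function predict_receiver, the point-mass kernel, and all variable names are invented stand-ins for the paper's full prism sensitivity calculation. It shows the pattern the abstract describes: per-cell sensitivities are evaluated in single precision inside an OpenACC-parallelized loop, while the per-receiver sum is accumulated in double precision.

```c
/* Hypothetical sketch, not the published code: the gravity response at one
 * receiver, summed over the cells of its (assumed) moving sensitivity
 * domain. Per-cell work is single precision; the accumulator is double. */
#include <stdio.h>
#include <math.h>

double predict_receiver(const float *gx, const float *gy, const float *gz,
                        const float *density, int n_cells,
                        float rx, float ry, float rz)
{
    double acc = 0.0; /* double-precision accumulator preserves accuracy */
    /* Offload the loop to the GPU when compiled with OpenACC; the
     * reduction clause sums the per-cell contributions on the device. */
    #pragma acc parallel loop reduction(+:acc) \
            copyin(gx[0:n_cells], gy[0:n_cells], gz[0:n_cells], \
                   density[0:n_cells])
    for (int i = 0; i < n_cells; ++i) {
        /* Single-precision geometry term: a point-mass stand-in for the
         * paper's full prism sensitivity (gravitational constant omitted) */
        float dx = gx[i] - rx, dy = gy[i] - ry, dz = gz[i] - rz;
        float r2 = dx * dx + dy * dy + dz * dz;
        float sens = dz / (r2 * sqrtf(r2)); /* vertical component kernel */
        acc += (double)(sens * density[i]); /* promote only when summing */
    }
    return acc;
}

int main(void)
{
    enum { N = 4 }; /* toy domain: four cells 500 m below the receiver */
    float gx[N] = {0.f, 100.f, 200.f, 300.f};
    float gy[N] = {0.f, 0.f, 0.f, 0.f};
    float gz[N] = {500.f, 500.f, 500.f, 500.f};
    float rho[N] = {2670.f, 2670.f, 2700.f, 2650.f};
    printf("response = %e\n",
           predict_receiver(gx, gy, gz, rho, N, 150.f, 0.f, 0.f));
    return 0;
}
```

Promoting only the accumulator to double keeps most arithmetic and all array storage in 4-byte floats, which is consistent with the roughly 40% speedup and halved memory footprint the abstract reports; without OpenACC support the pragma is ignored and the code still runs serially on the CPU.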
Year
2014
DOI
10.1016/j.cageo.2013.10.004
Venue
Computers & Geosciences
Keywords
parallel computing
Field
Massively parallel, Floating point, Computer science, Double-precision floating-point format, Parallel computing, Scalar (physics), Multiprocessing, Computational science, Memory footprint, Multi-core processor, Speedup
DocType
Journal
Volume
62
Issue
C
ISSN
0098-3004
Citations
3
PageRank
0.53
References
6
Authors
2
Name                 Order  Citations  PageRank
Martin Cuma          1      16         2.28
Michael S. Zhdanov   2      15         6.85