Title
DFT-FE 1.0: A massively parallel hybrid CPU-GPU density functional theory code using finite-element discretization
Abstract
We present DFT-FE 1.0, building on DFT-FE 0.6 (Motamarri et al., 2020 [28]), to conduct fast and accurate large-scale density functional theory (DFT) calculations (reaching ~100,000 electrons) on both many-core CPU and hybrid CPU-GPU computing architectures. This work involves improvements in the real-space formulation, via an improved treatment of the electrostatic interactions that substantially enhances the computational efficiency, as well as in high-performance computing aspects, including the GPU acceleration of all the key compute kernels in DFT-FE. We demonstrate the accuracy by comparing the ground-state energies, ionic forces and cell stresses on a wide range of benchmark systems against those obtained from widely used DFT codes. Further, we demonstrate the numerical efficiency of our GPU acceleration, which yields a ~20x speed-up on hybrid CPU-GPU nodes of the Summit supercomputer. Notably, owing to the parallel scaling of the GPU implementation, we obtain wall-times of 80-140 seconds for full ground-state calculations, with stringent accuracy, on benchmark systems containing ~6,000-15,000 electrons using 64-224 nodes of the Summit supercomputer.
Program summary
Program Title: DFT-FE
CPC Library link to program files: https://doi.org/10.17632/c5ghfc6ctn.1
Developer's repository link: https://github.com/dftfeDevelopers/dftfe
Licensing provisions: LGPL v3
Programming language: C/C++
External routines/libraries: p4est (http://www.p4est.org/), deal.II (https://www.dealii.org/), BLAS (http://www.netlib.org/blas/), LAPACK (http://www.netlib.org/lapack/), ELPA (https://elpa.mpcdf.mpg.de/), ScaLAPACK (http://www.netlib.org/scalapack/), Spglib (https://atztogo.github.io/spglib), ALGLIB (http://www.alglib.net/), LIBXC (http://www.tddft.org/programs/libxc/), PETSc (https://www.mcs.anl.gov/petsc), SLEPc (http://slepc.upv.es), NCCL (optional, https://github.com/NVIDIA/nccl).
Nature of problem: Density functional theory calculations.
Solution method: We employ a local real-space variational formulation of Kohn-Sham density functional theory that is applicable to both pseudopotential and all-electron calculations on periodic, semi-periodic and non-periodic geometries. A higher-order adaptive spectral finite-element basis is used to discretize the Kohn-Sham equations. The Chebyshev polynomial filtered subspace iteration procedure (ChFSI) is employed to solve the nonlinear Kohn-Sham eigenvalue problem self-consistently. ChFSI in DFT-FE employs Cholesky-factorization-based orthonormalization and a spectrum-splitting-based Rayleigh-Ritz procedure, in conjunction with mixed-precision arithmetic. A configurational force approach is used to compute ionic forces and periodic cell stresses for geometry optimization.
Additional comments including restrictions and unusual features: Exchange-correlation functionals are restricted to the Local Density Approximation (LDA) and the Generalized Gradient Approximation (GGA), with and without spin. The pseudopotentials available are optimized norm-conserving Vanderbilt (ONCV) pseudopotentials and Troullier-Martins (TM) pseudopotentials. Calculations are non-relativistic. DFT-FE handles all-electron and pseudopotential calculations in the same framework, while accommodating periodic, non-periodic and semi-periodic boundary conditions.
(C) 2022 Elsevier B.V. All rights reserved.
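The ChFSI steps named in the solution method (Chebyshev filtering, Cholesky-based orthonormalization, Rayleigh-Ritz projection) can be illustrated with a minimal NumPy sketch. This is not DFT-FE's implementation: it is a toy under stated assumptions, using a small dense matrix (a finite-difference 1D harmonic oscillator) in place of the finite-element Kohn-Sham Hamiltonian, a Gershgorin bound for the upper end of the spectrum, an unscaled filter, and a fixed number of passes with no self-consistency loop. Spectrum splitting and mixed-precision arithmetic are omitted. The function names and all parameter values are hypothetical.

```python
import numpy as np


def chebyshev_filter(H, X, degree, a, b):
    """Apply the degree-m Chebyshev polynomial of (H - c)/e to X.

    [a, b] is the unwanted part of the spectrum; it is mapped to [-1, 1],
    so components with eigenvalues below a are amplified relative to it.
    """
    e = 0.5 * (b - a)
    c = 0.5 * (b + a)
    Y_prev = X
    Y = (H @ X - c * X) / e
    for _ in range(2, degree + 1):
        # Three-term recurrence: T_{k+1}(t) = 2 t T_k(t) - T_{k-1}(t)
        Y_next = 2.0 * (H @ Y - c * Y) / e - Y_prev
        Y_prev, Y = Y, Y_next
    return Y


def cholesky_orthonormalize(X):
    """Orthonormalize the columns of X via S = X^T X = L L^T, X <- X L^{-T}."""
    L = np.linalg.cholesky(X.T @ X)
    return np.linalg.solve(L, X.T).T


def rayleigh_ritz(H, X):
    """Diagonalize H projected onto span(X); return Ritz values and vectors."""
    ritz, Q = np.linalg.eigh(X.T @ H @ X)
    return ritz, X @ Q


def chfsi(H, n_states, n_buffer=4, degree=20, n_passes=25, seed=0):
    """Toy ChFSI for the lowest n_states eigenpairs of a symmetric matrix H."""
    n = H.shape[0]
    rng = np.random.default_rng(seed)
    X = cholesky_orthonormalize(rng.standard_normal((n, n_states + n_buffer)))
    ritz, X = rayleigh_ritz(H, X)
    # Assumption: Gershgorin row sums as a cheap upper bound on the spectrum.
    b = np.abs(H).sum(axis=1).max()
    for _ in range(n_passes):
        a = ritz[-1]  # largest Ritz value marks the start of the unwanted range
        X = chebyshev_filter(H, X, degree, a, b)
        X = cholesky_orthonormalize(X)
        ritz, X = rayleigh_ritz(H, X)
    return ritz[:n_states], X[:, :n_states]


# Toy Hamiltonian: -1/2 d^2/dx^2 + 1/2 x^2 on a uniform grid (hypothetical test case).
n = 200
x = np.linspace(-8.0, 8.0, n)
h = x[1] - x[0]
H = (np.diag(1.0 / h**2 + 0.5 * x**2)
     - 0.5 / h**2 * (np.eye(n, k=1) + np.eye(n, k=-1)))

evals, vecs = chfsi(H, n_states=4)
print(evals)
```

The extra buffer vectors are carried because the states closest to the filter edge converge slowly; only the lowest `n_states` Ritz pairs are returned. In a production setting the filter is usually rescaled at each recurrence step to avoid overflow at high polynomial degree, which this sketch skips for brevity.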
Year: 2022
DOI: 10.1016/j.cpc.2022.108473
Venue: COMPUTER PHYSICS COMMUNICATIONS
Keywords: Electronic structure, Real-space, Spectral finite-elements, Mixed-precision arithmetic, Pseudopotential, All-electron, GPU
DocType: Journal
Volume: 280
ISSN: 0010-4655
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name                 Order  Citations  PageRank
Sambit Das           1      4          1.32
Phani Motamarri      2      0          0.34
Vishal Subramanian   3      0          0.34
David M. Rogers      4      0          0.68
Vikram Gavini        5      0          0.34