Efficient implementation of quantum materials simulations on distributed CPU-GPU systems - Citegraph

Paper Info

Title
Efficient implementation of quantum materials simulations on distributed CPU-GPU systems

Abstract
We present a scalable implementation of the Linearized Augmented Plane Wave method for distributed memory systems, which relies on an efficient distributed, block-cyclic setup of the Hamiltonian and overlap matrices and allows us to turn around highly accurate 1000+ atom all-electron quantum materials simulations on clusters with a few hundred nodes. The implementation runs efficiently on standard multi-core CPU nodes, as well as hybrid CPU-GPU nodes. The key for the latter is a novel algorithm to solve the generalized eigenvalue problem for dense, complex Hermitian matrices on distributed hybrid CPU-GPU systems. Performance tests for Li-intercalated CoO2 supercells containing 1501 atoms demonstrate that high-accuracy, transferable quantum simulations can now be used in throughput materials search problems. While our application can benefit and get scalable performance through CPU-only libraries like ScaLAPACK or ELPA2, our new hybrid solver enables the efficient use of GPUs and shows that a hybrid CPU-GPU architecture scales to a desired performance using substantially fewer cluster nodes, and notably, is considerably more energy efficient than the traditional multi-core CPU only systems for such complex applications.
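
The abstract refers to a distributed, block-cyclic setup of the Hamiltonian and overlap matrices. As a rough illustration only, the sketch below shows the standard two-dimensional block-cyclic index mapping used by libraries such as ScaLAPACK, assuming the distribution starts at process (0, 0). The function names and the block and grid sizes are hypothetical and are not taken from the paper or its code.

```python
# Minimal sketch of a 2D block-cyclic index mapping (ScaLAPACK-style convention,
# first block owned by process 0). All names here are illustrative assumptions.

def owner_and_local(g, block, nprocs):
    """Map a global index g along one dimension to (owning process, local index)."""
    blk = g // block                      # global block that contains index g
    owner = blk % nprocs                  # blocks are dealt out round-robin
    local = (blk // nprocs) * block + g % block
    return owner, local


def block_cyclic_2d(i, j, mb, nb, prow, pcol):
    """Map global element (i, j) to ((process row, process col), (local i, local j))."""
    pr, li = owner_and_local(i, mb, prow)
    pc, lj = owner_and_local(j, nb, pcol)
    return (pr, pc), (li, lj)


def local_count(n, block, nprocs, p):
    """Count how many of the n global indices along one dimension land on process p."""
    return sum(1 for g in range(n) if owner_and_local(g, block, nprocs)[0] == p)


if __name__ == "__main__":
    # Hypothetical 8x8 matrix, 2x2 blocks, 2x2 process grid.
    print(block_cyclic_2d(5, 2, mb=2, nb=2, prow=2, pcol=2))
    print([local_count(8, 2, 2, p) for p in range(2)])
```

Because consecutive blocks are dealt to processes in round-robin order, each process holds a near-equal share of rows and columns.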

Year	2015
DOI	10.1145/2807591.2807654
Venue	International Conference for High Performance Computing, Networking, Storage, and Analysis
Keywords	distributed CPU-GPU systems, linearized augmented plane wave method, distributed memory system, efficient distributed block-cyclic setup, Hamiltonian matrices, overlap matrices, all-electron quantum materials simulation, multicore CPU node, hybrid CPU-GPU node, generalized eigenvalue problem, dense complex Hermitian matrices, Li-intercalated CoO2 supercells, high-accuracy transferable quantum simulation, ScaLAPACK, ELPA2, hybrid CPU-GPU architecture
Field	Computer science, Matrix (mathematics), Load balancing (computing), Efficient energy use, Parallel computing, ScaLAPACK, Eigendecomposition of a matrix, Solver, Hermitian matrix, Scalability, Distributed computing
DocType	Conference
ISBN	978-1-5090-0273-3
Citations	4
PageRank	0.49
References	17
Authors	6

Authors (6 rows)

Name	Order	Citations	PageRank
Raffaele Solcà	1	35	3.74
Anton Kozhevnikov	2	6	1.48
Azzam Haidar	3	409	35.39
Stanimire Tomov	4	1214	102.02
Jack J. Dongarra	5	17625	2615.79
Thomas C. Schulthess	6	106	15.16
