Abstract |
---|
Graphics Processing Units (GPUs) are becoming the workhorse of scalable computations. MADNESS is a scientific framework used especially for computational chemistry. Most MADNESS applications use operators that involve many small tensor computations, resulting in a less regular organization of computations on GPUs. A single GPU kernel may have to multiply by hundreds of small square matrices (with fixed dimension ranging from 10 to 28). We demonstrate a scalable CPU-GPU implementation of the MADNESS framework over a 500-node partition on the Titan supercomputer. For this hybrid CPU-GPU implementation, we observe up to a 2.3-times speedup compared to an equivalent CPU-only implementation with 16 cores per node. For smaller matrices, we demonstrate a speedup of 2.2-times by using a custom CUDA kernel rather than a cuBLAS-based kernel. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/CLUSTER.2012.42 | CLUSTER |
Keywords | Field | DocType |
cublas-based kernel,scalable cpu-gpu implementation,large cpu-gpu clusters,madness framework,scientific framework,hybrid cpu-gpu implementation,custom cuda kernel,equivalent cpu-only implementation,madness application,adapting irregular computations,scalable computation,single gpu kernel,accuracy,instruction sets,tensile stress,kernel,supercomputing,statistical analysis,tensors,computational modeling | Kernel (linear algebra),Supercomputer,GPU cluster,Computer science,CUDA,Parallel computing,Computational science,Titan (supercomputer),Graphics processing unit,Speedup,Scalability | Conference |
ISSN | Citations | PageRank |
1552-5244 | 1 | 0.37 |
References | Authors |
---|---|
10 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Vlad Slavici | 1 | 7 | 1.90 |
Raghu Varier | 2 | 1 | 0.37 |
Gene Cooperman | 3 | 267 | 35.78 |
Robert J. Harrison | 4 | 769 | 74.50 |