Title
A Multi-GPU Implementation of a D2Q37 Lattice Boltzmann Code
Abstract
We describe a parallel implementation of a compressible Lattice Boltzmann code on a multi-GPU cluster based on NVIDIA Fermi processors. We analyze how to optimize the algorithm for GP-GPU architectures, describe the implementation choices that we have adopted, and compare our performance results with an implementation optimized for latest-generation multi-core CPUs. Our program runs at ≈30% of the double-precision peak performance of one GPU and shows almost linear scaling when run on the multi-GPU cluster.
Year
2011
DOI
10.1007/978-3-642-31464-3_65
Venue
Lecture Notes in Computer Science
Keywords
double-precision peak performance, performance result, parallel implementation, implementation choice, lattice boltzmann code, multi-gpu implementation, latest generation multi-core cpus, gp-gpu architecture, multi-gpu cluster, nvidia fermi processor, linear scaling, compressible lattice boltzmann code, computational fluid dynamics, lattice boltzmann methods
DocType
Conference
Volume
7203
ISSN
0302-9743
Citations
7
PageRank
0.86
References
6
Authors
9