Title
Practical applicability of optimizations and performance models to complex stencil-based loop kernels in CFD
Abstract
This work investigates the application and interaction of optimization techniques and performance models in a computational fluid dynamics (CFD) approach employing an OpenMP parallelized, explicit, weakly compressible, finite difference–based solver for the incompressible Navier–Stokes equations using a five-point wide stencil. The presented loop and stencil optimizations lead to a 6.8× increase in per core throughput. In order to verify optimal CPU utilization, performance models are applied to the tuned code. Three different performance models are considered: a roofline-based model, utilizing purely theoretical figures, one which is enhanced by measurements, and the execution cache memory model. It is shown that the models provide reliable estimates for simple benchmarks, such as seven-point stencils for scalar Laplacians, but the estimate quality is significantly worse for the complex and tuned stencil. While it is possible to include even more details in the model, it eventually leads to a state in which it purely reproduces the benchmarks from which it was derived. Thus, the applied general-purpose performance models are found to inaccurately predict the actual performance. They overestimate the achievable performance by more than about 97% for highly tuned code. Through further code tuning, 66% of the predicted performance could be achieved.
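As a rough illustration of the kind of roofline estimate the abstract refers to, the sketch below bounds attainable performance by the smaller of peak compute and memory bandwidth times arithmetic intensity. It is not the authors' model: the machine figures, the flop count, and the byte count per lattice update are all assumptions chosen for a double-precision seven-point scalar Laplacian with ideal cache reuse.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, flops_per_update, bytes_per_update):
    """Classic roofline bound: min(peak, bandwidth * arithmetic intensity)."""
    intensity = flops_per_update / bytes_per_update  # flop per byte
    return min(peak_gflops, bandwidth_gbs * intensity)


# Hypothetical machine figures (assumptions, not taken from the paper).
PEAK_GFLOPS = 500.0    # theoretical peak of the chip
BANDWIDTH_GBS = 50.0   # sustained memory bandwidth

# Assumed cost of one seven-point Laplacian update in double precision:
# 8 flops; 24 bytes of memory traffic (load input, write-allocate output,
# store output), with neighbor values served from cache.
FLOPS_PER_UPDATE = 8
BYTES_PER_UPDATE = 24

bound = roofline_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, FLOPS_PER_UPDATE, BYTES_PER_UPDATE)
updates_per_second = BANDWIDTH_GBS * 1e9 / BYTES_PER_UPDATE

print(f"roofline bound: {bound:.2f} GFLOP/s")
print(f"memory-bound update rate: {updates_per_second / 1e9:.2f} GLUP/s")
```

With these assumed figures the kernel is memory bound, so the estimate depends only on the bandwidth and the bytes moved per update. A measurement-enhanced variant would replace the theoretical bandwidth with a benchmarked value.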
Year: 2019
DOI: 10.1177/1094342018774126
Venue: Periodicals
Keywords: Performance modeling, performance optimization, stencil, finite difference
Field: Computer science, Parallel computing, Stencil, Computational fluid dynamics
DocType: Journal
Volume: 33
Issue: 4
ISSN: 1094-3420
Citations: 0
PageRank: 0.34
References: 16
Authors: 4
Name | Order | Citations | PageRank
Karl-Robert Wichmann | 1 | 0 | 0.34
Martin Kronbichler | 2 | 323 | 31.00
Rainald Löhner | 3 | 138 | 15.24
Wolfgang A. Wall | 4 | 61 | 22.07