Title
Vectorization of a spectral finite-element numerical kernel
Abstract
In this paper, we present an optimized implementation of the finite-element method numerical kernel for SIMD vectorization. A typical application is the modelling of seismic wave propagation. In this case, the computations at the element level are generally based on nested loops in which the memory accesses are non-contiguous. Moreover, the back and forth between the element level and the global level (e.g., the assembly phase) is a serious obstacle to automatic vectorization by compilers and to efficient reuse of data at the cache memory levels. This is particularly true when the problem under study relies on an unstructured mesh. The application proxies used for our experiments were extracted from the EFISPEC code, which implements the spectral finite-element method to solve the elastodynamic equations. We underline that the intra-node performance may be further improved. Additionally, we show that standard compilers such as GNU GCC, Clang and Intel ICC are unable to perform automatic vectorization even when the nested loops are reorganized or when SIMD pragmas are added. Because of the irregular memory access pattern, we introduce a dedicated strategy to squeeze the maximum performance out of the SIMD units. Experiments are carried out on Intel Broadwell and Skylake platforms, which offer AVX2 and AVX-512 SIMD units, respectively. We believe that our vectorization approach may be generic enough to be adapted to other codes.
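The element-to-global round trip mentioned in the abstract can be illustrated with a minimal C sketch. It is not taken from EFISPEC or from the paper: the function name `assemble`, the constant `NODES_PER_ELEM`, the connectivity array `conn` and the trivial scaling "kernel" are all assumptions chosen for illustration. The point is only the shape of the loop: an indirect gather, a local computation, and an indirect scatter-add.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element size; a spectral element has many more points. */
#define NODES_PER_ELEM 4

/*
 * Illustrative gather / compute / scatter-add loop over an unstructured mesh.
 * u       : global input field, one value per node
 * f       : global output field, accumulated into (must be initialized)
 * conn    : connectivity, NODES_PER_ELEM global node indices per element
 * scale   : stand-in for the real element-level computation
 */
void assemble(const double *u, double *f, const int *conn,
              size_t n_elem, size_t n_nodes, double scale)
{
    double local[NODES_PER_ELEM];

    for (size_t e = 0; e < n_elem; ++e) {
        const int *c = conn + e * NODES_PER_ELEM;

        /* Gather: indirect, non-contiguous reads from the global array. */
        for (int i = 0; i < NODES_PER_ELEM; ++i) {
            assert(c[i] >= 0 && (size_t)c[i] < n_nodes);
            local[i] = u[c[i]];
        }

        /* Element-level work on contiguous local data (placeholder). */
        for (int i = 0; i < NODES_PER_ELEM; ++i)
            local[i] *= scale;

        /* Scatter-add: indirect writes; neighbouring elements share nodes. */
        for (int i = 0; i < NODES_PER_ELEM; ++i)
            f[c[i]] += local[i];
    }
}
```

In this sketch, two elements that share nodes both accumulate into the same entries of `f`, so the element loop cannot simply be run in SIMD lanes without handling those write conflicts, and the indexed loads and stores prevent plain contiguous vector accesses.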
Year
2018
DOI
10.1145/3178433.3178441
Venue
WPMVP@PPoPP
Field
Kernel (linear algebra), CPU cache, Computer science, Parallel computing, SIMD, Vectorization (mathematics), Finite element method, Compiler, Nested loop join, Computation
DocType
Conference
ISBN
978-1-4503-5646-6
Citations
2
PageRank
0.39
References
12
Authors
3
Name                Order  Citations  PageRank
Sylvain Jubertie    1      25         5.70
Fabrice Dupros      2      100        11.40
Florent De Martin   3      10         1.68