Abstract | ||
---|---|---|
In this paper, we propose an implementation of the fast Fourier transform (FFT) targeting the ARM Scalable Vector Extension (SVE). For FFT kernels, we performed automatic vectorization by a compiler and explicit vectorization through code generation by SPIRAL, and compared their performance. We show that explicit vectorization of SPIRAL-generated code improves performance significantly. Performance results of FFTs on RIKEN's Fugaku processor simulator are reported. With the ARM compiler, SPIRAL-generated FFT kernels written with SVE intrinsics are up to 3.16 times faster than the FFT kernels of FFTE written in Fortran and up to 5.62 times faster than SPIRAL-generated FFT kernels written in C.
|
Year | DOI | Venue |
---|---|---|
2020 | 10.1145/3368474.3368488 | Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region |
Keywords | DocType | ISBN |
ARM SVE, FFT, SPIRAL, vectorization | Conference | 978-1-4503-7236-7 |
Citations | PageRank | References |
2 | 0.41 | 0 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Daisuke Takahashi | 1 | 2 | 0.41 |
Franz Franchetti | 2 | 974 | 88.39 |