Title | ||
---|---|---|
Practically efficient methods for performing bit-reversed permutation in C++11 on the x86-64 architecture. |
Abstract | ||
---|---|---|
The bit-reversed permutation is a famous task in signal processing and is key to efficient implementation of the fast Fourier transform. This paper presents optimized C++11 implementations of five extant methods for computing the bit-reversed permutation: Stockham auto-sort, naive bitwise swapping, swapping via a table of reversed bytes, local pairwise swapping of bits, and swapping via a cache-localized matrix buffer. Three new strategies for performing the bit-reversed permutation in C++11 are proposed: an inductive method using the bitwise XOR operation, a template-recursive closed form, and a cache-oblivious template-recursive approach, which reduces the bit-reversed permutation to smaller bit-reversed permutations and a square matrix transposition. These new methods are compared to the extant approaches in terms of theoretical runtime, empirical compile time, and empirical runtime. The template-recursive cache-oblivious method is shown to be competitive with the fastest known method; however, we demonstrate that the cache-oblivious method can more readily benefit from parallelization on multiple cores and on the GPU. |
Year | Venue | DocType |
---|---|---|
2017 | CoRR | Journal |
Volume | Citations | PageRank |
abs/1708.01873 | 0 | 0.34 |
References | Authors | |
0 | 7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Christian Knauth | 1 | 0 | 0.34 |
Boran Adas | 2 | 0 | 0.34 |
Daniel Whitfield | 3 | 0 | 0.34 |
Xuesong Wang | 4 | 37 | 3.61 |
Lydia Ickler | 5 | 0 | 0.34 |
Tim Conrad | 6 | 0 | 0.68 |
Oliver Serang | 7 | 0 | 1.01 |