Abstract | ||
---|---|---|
This paper compares two implementations of reconfigurable and high-throughput turbo decoders. The first implementation is optimized for an NVIDIA Kepler graphics processing unit (GPU), whereas the second implementation is for an Intel Ivy Bridge processor. Both implementations support max-log-MAP and log-MAP turbo decoding algorithms, various code rates, different interleaver types, and all block lengths, as specified by HSPA+ and LTE-Advanced. To ensure a fair comparison between the two implementations, we perform device-specific optimizations to improve the decoding throughput and error-rate performance. Our results show that the Intel Ivy Bridge processor implementation achieves up to 2x higher decoding throughput than our GPU implementation. In addition, our CPU implementation requires roughly 4x fewer codewords to be processed in parallel to achieve its peak throughput. |
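
The abstract names two decoding algorithms, log-MAP and max-log-MAP. As commonly defined, they differ in how they combine two log-domain metrics: log-MAP uses the exact Jacobian logarithm, while max-log-MAP keeps only the maximum. The following is a minimal sketch of that difference, not the paper's implementation; the function names `max_star_exact` and `max_star_approx` are hypothetical and chosen only for illustration.

```python
import math


def max_star_exact(a: float, b: float) -> float:
    """Exact log(exp(a) + exp(b)) via the Jacobian logarithm (log-MAP style)."""
    # The correction term log(1 + exp(-|a - b|)) is what log-MAP retains.
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def max_star_approx(a: float, b: float) -> float:
    """Approximation that drops the correction term (max-log-MAP style)."""
    return max(a, b)


if __name__ == "__main__":
    a, b = 1.0, 3.0
    gap = max_star_exact(a, b) - max_star_approx(a, b)
    print(f"correction term dropped by the approximation: {gap:.4f}")
```

The approximation never exceeds the exact value, and the gap shrinks as the two metrics move further apart. This is the usual reason max-log-MAP is described as cheaper to compute at some cost in error-rate performance.
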
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/ACSSC.2013.6810402 | 2013 Asilomar Conference on Signals, Systems and Computers |
Keywords | Field | DocType |
measurement, decoding, throughput, turbo codes, vectors | Turbo, Computer architecture, Ivy Bridge, CUDA, Computer science, Parallel computing, Turbo code, Throughput, Decoding methods, Graphics processing unit, LTE Advanced | Conference |
ISSN | Citations | PageRank |
1058-6393 | 6 | 0.66 |
References | Authors | |
9 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Michael Wu | 1 | 271 | 18.30 |
Guohui Wang | 2 | 1088 | 60.78 |
Bei Yin | 3 | 212 | 14.61 |
Christoph Studer | 4 | 1097 | 85.83 |
Joseph R. Cavallaro | 5 | 1175 | 115.35 |