Title
High-Throughput Elliptic Curve Cryptography Using AVX2 Vector Instructions
Abstract
Single Instruction Multiple Data (SIMD) execution engines such as Intel's Advanced Vector Extensions 2 (AVX2) offer great potential to accelerate elliptic curve cryptography compared to implementations that use only basic x64 instructions. All existing AVX2 implementations of scalar multiplication on, e.g., Curve25519 (and alternative curves) are optimized for low latency. We argue in this paper that many real-world applications, such as server-side SSL/TLS handshake processing, would benefit more from throughput-optimized implementations than from latency-optimized ones. To support this argument, we introduce a throughput-optimized AVX2 implementation of variable-base scalar multiplication on Curve25519 and fixed-base scalar multiplication on Ed25519. Both implementations perform four scalar multiplications in parallel, where each uses a 64-bit element of a 256-bit vector. The field arithmetic is based on a radix-2^29 representation of the field elements, which makes it possible to carry out four parallel multiplications modulo a multiple of p = 2^255 - 19 in just 88 cycles on a Skylake CPU. Four variable-base scalar multiplications on Curve25519 require less than 250,000 Skylake cycles, which translates to a throughput of 32,318 scalar multiplications per second at a clock frequency of 2 GHz. For comparison, the best latency-optimized AVX2 implementation to date has a throughput of some 21,000 scalar multiplications per second on the same Skylake CPU.
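The radix-2^29 idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' AVX2 code: the limb count of 9 (since 9 * 29 = 261 >= 255), the schoolbook multiplication, and all function names (`to_limbs`, `from_limbs`, `mul_limbs`, `mul_4way`) are assumptions made for illustration. Each list position in `mul_4way` stands in for one 64-bit element of a 256-bit vector, and Python's unbounded integers hide the overflow management that a real 64-bit implementation would need.

```python
P = 2**255 - 19                  # Curve25519 / Ed25519 field prime
RADIX_BITS = 29                  # radix 2^29, as stated in the abstract
NLIMBS = 9                       # assumption: 9 limbs, because 9 * 29 = 261 >= 255
MASK = (1 << RADIX_BITS) - 1
# 2^261 = 2^6 * 2^255 is congruent to 2^6 * 19 = 1216 modulo P
FOLD = 19 << (NLIMBS * RADIX_BITS - 255)

def to_limbs(x):
    """Split 0 <= x < 2^261 into 9 limbs of 29 bits, least significant first."""
    return [(x >> (RADIX_BITS * i)) & MASK for i in range(NLIMBS)]

def from_limbs(limbs):
    """Recombine limbs into a single integer."""
    return sum(limb << (RADIX_BITS * i) for i, limb in enumerate(limbs))

def mul_limbs(a, b):
    """Schoolbook product of two limb vectors, congruent to a*b modulo P.

    The result is only partially reduced: every limb is below 2^29,
    but the represented value may still be larger than P.
    """
    cols = [0] * (2 * NLIMBS - 1)
    for i in range(NLIMBS):
        for j in range(NLIMBS):
            cols[i + j] += a[i] * b[j]        # each partial product is below 2^58
    # Columns 9..16 carry weight 2^261 * 2^(29*(k-9)); fold them onto 0..7.
    for k in range(NLIMBS, 2 * NLIMBS - 1):
        cols[k - NLIMBS] += FOLD * cols[k]
    cols = cols[:NLIMBS]
    # Propagate carries; a carry out of the top limb has weight 2^261 and is
    # folded back into limb 0 until none remains.
    while True:
        carry = 0
        for i in range(NLIMBS):
            c = cols[i] + carry
            cols[i] = c & MASK
            carry = c >> RADIX_BITS
        if carry == 0:
            return cols
        cols[0] += FOLD * carry

def mul_4way(xs, ys):
    """Four independent field multiplications, one per simulated vector lane."""
    return [from_limbs(mul_limbs(to_limbs(x), to_limbs(y))) % P
            for x, y in zip(xs, ys)]
```

With 29-bit limbs each partial product stays below 2^58, which leaves headroom in a 64-bit lane for accumulating several products before a carry step; this is presumably part of the appeal of the radix, though the sketch itself does not model lane widths.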
Year
2020
DOI
10.1007/978-3-030-81652-0_27
Venue
SELECTED AREAS IN CRYPTOGRAPHY
Keywords
Throughput-optimized cryptography, Curve25519, Single instruction multiple data (SIMD), Advanced vector extension 2 (AVX2)
DocType
Conference
Volume
12804
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
5
Name
Order
Citations
PageRank
Hao Cheng104.73
Johann Großschädl2212.49
Jiaqi Tian300.34
Peter B. Rønne4129.33
Peter Y. A. Ryan572866.96