Title
Scalability analysis of AVX-512 extensions
Abstract
Energy efficiency below a specific thermal design power (TDP) has become the main design goal for microprocessors across all market segments. Making the best use of the available transistors within the TDP remains an open problem. Parallelism is the foundation for reaching exascale performance. While instruction-level and thread-level parallelism are embraced by developers, data-level parallelism (e.g. single-instruction multiple-data execution) is usually underutilized despite its huge potential. Companies are pushing the size of vector registers to double every 4 years. Intel’s AVX-512 (512-bit registers) and ARM’s SVE (up to 2048-bit registers) are examples of this trend. In this paper, we perform a scalability and energy efficiency analysis of AVX-512 using the ParVec benchmark suite. ParVec is extended to support AVX-512 as well as the newest versions of the GCC compiler. We use Intel’s Top–Down model to show the main bottlenecks of the architecture for each studied benchmark. Results show that the performance and energy improvements depend greatly on the fraction of code that can be vectorized. Energy improvements over scalar codes in a single-thread environment range from 2× for Streamcluster (worst) to 8× for Blackscholes (best).
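The dependence on the vectorizable fraction can be illustrated with an Amdahl-style bound. The sketch below is not taken from the paper: the function name `vector_speedup`, the choice of 16 lanes (512 bits divided by 32-bit floats), and the sample fractions are assumptions made only for illustration, and the model ignores memory bottlenecks and frequency effects.

```python
def vector_speedup(vector_fraction: float, lanes: int) -> float:
    """Amdahl-style upper bound on speedup over scalar code.

    vector_fraction: share of scalar run time that can be vectorized (0..1).
    lanes: elements processed per vector instruction (hypothetical ideal).
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be between 0 and 1")
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    # The scalar part runs at the original speed; the vector part is
    # assumed to run 'lanes' times faster.
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / lanes)


if __name__ == "__main__":
    # Assumption: 16 single-precision lanes in a 512-bit register.
    for fraction in (0.5, 0.9, 0.99):
        bound = vector_speedup(fraction, 16)
        print(f"vectorizable fraction {fraction:.2f}: bound {bound:.2f}x")
```

Under this simplified model, even a modest non-vectorizable share caps the achievable gain well below the lane count, which is in line with the abstract's observation that results vary widely between benchmarks.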
Year: 2020
DOI: 10.1007/s11227-019-02840-7
Venue: The Journal of Supercomputing
Keywords: Benchmarking, Vector, Efficiency, SIMD
Field: Thermal design power, Suite, Efficient energy use, Computer science, Parallel computing, Scalar (physics), SIMD, Compiler, Benchmarking, Scalability
DocType: Journal
Volume: 76
Issue: 3
ISSN: 0920-8542
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name | Order | Citations | PageRank
Juan Manuel Cebrian | 1 | 24 | 10.19
Lasse Natvig | 2 | 109 | 19.61
Magnus Jahre | 3 | 226 | 20.50