Title
Effective SIMD vectorization for Intel Xeon Phi coprocessors
Abstract
Efficiently exploiting SIMD vector units is one of the most important aspects of achieving high performance for application code running on Intel Xeon Phi coprocessors. In this paper, we present several effective SIMD vectorization techniques, such as less-than-full-vector loop vectorization, Intel MIC specific alignment optimization, and small matrix transpose/multiplication 2D vectorization, implemented in the Intel C/C++ and Fortran production compilers for Intel Xeon Phi coprocessors. A set of workloads from several application domains is employed to conduct the performance study of our SIMD vectorization techniques. The performance results show that we achieved up to a 12.5x performance gain on the Intel Xeon Phi coprocessor. We also demonstrate a 2000x performance speedup from the seamless integration of SIMD vectorization and parallelization.
Year: 2015
DOI: 10.1155/2015/269764
Venue: Periodicals
Field: MMX, SSE2, Computer science, Xeon Phi, Parallel computing, Vectorization (mathematics), SIMD, Compiler, Coprocessor, Speedup
DocType: Journal
Volume: 2015
Issue: 1
ISSN: 1058-9244
Citations: 6
PageRank: 0.55
References: 10
Authors: 8
Name                  Order  Citations  PageRank
Xinmin Tian           1      596        52.92
Hideki Saito          2      177        14.88
Serguei Preis         3      28         1.86
Eric N. Garcia        4      29         1.79
Sergey Kozhukhov      5      42         2.21
Matt Masten           6      29         2.13
Aleksei G. Cherkasov  7      34         2.19
Nikolay Panchenko     8      34         2.53