Title
A Matrix–Matrix Multiplication methodology for single/multi-core architectures using SIMD
Abstract
This paper presents a new methodology for speeding up Matrix–Matrix Multiplication using the Single Instruction Multiple Data (SIMD) unit on one or more cores that share a cache. The methodology achieves higher execution speed than the state-of-the-art ATLAS library (speedups from 1.08 to 3.5) by reducing the number of instructions (load/store and arithmetic) and the number of data cache accesses and misses in the memory hierarchy. It does so by fully exploiting the software characteristics (e.g. data reuse) and the hardware parameters (e.g. data cache sizes and associativities) together as a single problem rather than separately, which yields high-quality solutions and a smaller search space.
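The data reuse mentioned in the abstract can be illustrated with a generic loop-tiling sketch in C. This is not the authors' methodology, which the abstract does not spell out. It is a textbook cache-blocking example under stated assumptions: square row-major matrices, and a hypothetical tuning parameter `tile` that would in practice be chosen from the cache size and associativity. The function names `matmul_naive` and `matmul_tiled` are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Smaller of two sizes; used to clip tiles at the matrix edge. */
static size_t min_size(size_t a, size_t b) { return a < b ? a : b; }

/* Reference version: C = A * B for row-major n x n matrices. */
void matmul_naive(size_t n, const float *A, const float *B, float *C)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t k = 0; k < n; k++)
                acc += A[i * n + k] * B[k * n + j];
            C[i * n + j] = acc;
        }
}

/* Tiled version: same result, computed block by block so that a
 * tile of B is reused for several rows of A while it is still in
 * cache. `tile` is a hypothetical tuning parameter (assumption). */
void matmul_tiled(size_t n, size_t tile,
                  const float *A, const float *B, float *C)
{
    assert(tile > 0);

    for (size_t i = 0; i < n * n; i++)
        C[i] = 0.0f;

    for (size_t ii = 0; ii < n; ii += tile)
        for (size_t kk = 0; kk < n; kk += tile)
            for (size_t jj = 0; jj < n; jj += tile) {
                size_t i_end = min_size(ii + tile, n);
                size_t k_end = min_size(kk + tile, n);
                size_t j_end = min_size(jj + tile, n);

                for (size_t i = ii; i < i_end; i++)
                    for (size_t k = kk; k < k_end; k++) {
                        /* One element of A is loaded once and
                         * reused across the whole inner loop. */
                        float a = A[i * n + k];
                        /* Unit-stride loop over j: a natural
                         * candidate for SIMD vectorization. */
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
            }
}
```

Calling `matmul_tiled(n, tile, A, B, C)` with any positive `tile` produces the same product as `matmul_naive(n, A, B, C)`; only the order in which memory is touched changes. Whether it runs faster depends on the matrix size and the cache hierarchy of the target machine.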
Year
2014
DOI
https://doi.org/10.1007/s11227-014-1098-9
Venue
The Journal of Supercomputing
Keywords
Matrix–Matrix Multiplication, Data cache, Cache associativity, Multi-core, SIMD, Memory management
Field
Shared memory, Computer science, Matrix (mathematics), Parallel computing, SIMD, Multiplication, Memory management, Data cache, Multi-core processor, Matrix multiplication
DocType
Journal
Volume
68
Issue
3
ISSN
0920-8542
Citations
7
PageRank
0.46
References
43
Authors
3
Name, Order, Citations, PageRank
Vasilios Kelefouras, 1, 27, 5.28
Angeliki Kritikakou, 2, 66, 12.85
Costas E Goutis, 3, 186, 25.76