Abstract |
---|
An overview is given of the lessons learned from the introduction of multi-threading using OpenMP in tmLQCD. In particular, programming style, performance measurements, cache misses, scaling, thread distribution for hybrid codes, race conditions, the overlapping of communication and computation and the measurement and reduction of certain overheads are discussed. Performance measurements and sampling profiles are given for different implementations of the hopping matrix computational kernel. |

Year | DOI | Venue |
---|---|---|
2013 | 10.7892/boris.60597 | CoRR |

Field | DocType | Volume |
---|---|---|
Kernel (linear algebra), Matrix (mathematics), Cache, Parallel computing, Programming style, Thread (computing), Sampling (statistics), Scaling, Computation, Particle physics, Physics | Journal | abs/1311.4521 |

ISSN | Citations | PageRank |
---|---|---|
PoS(LATTICE 2013)416 | 0 | 0.34 |

References | Authors |
---|---|
0 | 4 |

Name | Order | Citations | PageRank |
---|---|---|---|
Albert Deuzeman | 1 | 0 | 0.68 |
Karl Jansen | 2 | 33 | 7.43 |
B. Kostrzewa | 3 | 0 | 0.34 |
Carsten Urbach | 4 | 9 | 2.06 |