Title
Dealing with Layers of Obfuscation in Pseudo-Uniform Memory Architectures.
Abstract
Pseudo-Uniform Memory Architectures hide the memory's throughput bottlenecks and the network's latency differences in order to provide near-peak average throughput for computations on large datasets. This obviates the need for application-level partitioning and load balancing between NUMA domains, but the performance of cross-core communication still depends on the actual placement of the involved variables and cores, which can result in significant variation within applications and between application runs. This paper analyses the pseudo-uniform memory latency on the Intel Xeon Phi Knights Corner processor, derives strategies for the optimised placement of important variables, and discusses the role of localised coordination in pUMA systems. For example, a basic cache line ping-pong benchmark showed a 3x speedup between adjacent cores. Therefore, pUMA systems combined with support for controlled placement of small datasets are an interesting option when processor-wide load balancing is difficult while localised coordination is feasible.
Year
2016
DOI
10.1007/978-3-319-58943-5_55
Venue
Lecture Notes in Computer Science
Field
Memory bank, Physical address, Computer science, CPU cache, Load balancing (computing), Parallel computing, Throughput, Obfuscation, Memory architecture, Cache coherence, Distributed computing
DocType
Conference
Volume
10104
ISSN
0302-9743
Citations
0
PageRank
0.34
References
0
Authors
4
Name, Order, Citations, PageRank
Randolf Rotta, 1, 114, 8.98
Robert Kuban, 2, 0, 0.68
Mark Simon Schöps, 3, 0, 0.34
Jörg Nolte, 4, 29, 10.00