Title
Optimizing shared data accesses in distributed-memory X10 systems
Abstract
Prior studies have established the performance impact of coherence protocols optimized for specific patterns of shared-data accesses in Non-Uniform-Memory-Architecture (NUMA) systems. First, this work incorporates a directory-based protocol into the runtime system of X10, a Partitioned-Global-Address-Space (PGAS) programming language, to manage read-mostly, producer-consumer, stencil, and migratory variables. This protocol complements the existing X10Protocol, which keeps a unique copy of a shared variable and relies on message transfers for all remote accesses. The X10Protocol is effective for managing accumulator, write-mostly, and general read-write variables. Second, this work introduces a new shared-variable access-pattern profiler, which a new coherence-policy manager uses to decide which protocol should be used for each shared variable. The profiler can run in both offline and online modes. An evaluation on a 128-core distributed-memory machine reveals that coordination between these protocols does not degrade performance on any of the applications studied and achieves speedups in the range of 15% to 40% over X10Protocol. The performance is also comparable to that of carefully hand-written versions of the applications.
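The pattern-to-protocol assignment described in the abstract can be sketched as a small lookup. This is only an illustrative sketch in Python, not the X10 runtime's code: the names `Pattern`, `Protocol`, `DIRECTORY_PATTERNS`, and `choose_protocol` are hypothetical, and only the grouping of access patterns is taken from the abstract.

```python
from enum import Enum


class Pattern(Enum):
    """Access patterns named in the abstract."""
    READ_MOSTLY = "read-mostly"
    PRODUCER_CONSUMER = "producer-consumer"
    STENCIL = "stencil"
    MIGRATORY = "migratory"
    ACCUMULATOR = "accumulator"
    WRITE_MOSTLY = "write-mostly"
    GENERAL_READ_WRITE = "general read-write"


class Protocol(Enum):
    """The two protocols the abstract says are coordinated."""
    DIRECTORY = "directory-based protocol"
    X10 = "X10Protocol"


# Patterns the abstract assigns to the directory-based protocol.
# Every other pattern stays with X10Protocol (single copy, remote messages).
DIRECTORY_PATTERNS = frozenset({
    Pattern.READ_MOSTLY,
    Pattern.PRODUCER_CONSUMER,
    Pattern.STENCIL,
    Pattern.MIGRATORY,
})


def choose_protocol(pattern: Pattern) -> Protocol:
    """Hypothetical policy-manager decision for one shared variable."""
    if pattern in DIRECTORY_PATTERNS:
        return Protocol.DIRECTORY
    return Protocol.X10
```

In this sketch the pattern is assumed to be supplied by a profiler; how the actual profiler classifies a variable, offline or online, is not specified in the abstract and is not modeled here.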
Year
2014
DOI
10.1109/HiPC.2014.7116889
Venue
International Conference on High Performance Computing
Field
Data structure, Computer science, Stencil, Parallel computing, Distributed memory, Data diffusion machine, Distributed shared memory, Partitioned global address space, Operating system, Distributed computing, Runtime system, Speedup
DocType
Conference
ISSN
1094-7256
ISBN
978-1-4799-5975-4
Citations
1
PageRank
0.36
References
25
Authors
3
Name | Order | Citations | PageRank
Jeeva Paudel | 1 | 22 | 2.82
Olivier Tardieu | 2 | 462 | 32.13
José Nelson Amaral | 3 | 436 | 40.18