Title |
---|
Tiny Directory: Efficient Shared Memory in Many-Core Systems with Ultra-Low-Overhead Coherence Tracking |
Abstract |
---|
The sparse directory has emerged as a critical component for supporting the shared memory abstraction in multi- and many-core chip-multiprocessors. Recent research efforts have explored ways to reduce the number of entries in the sparse directory. These include tracking coherence of private regions at a coarse grain, not tracking blocks that belong to pages identified as private by the operating system (OS), and not tracking a subset of blocks that are speculated to be private by the hardware. These techniques require support for multi-grain coherence, assistance of the OS, or broadcast-based recovery on sharing an untracked block that is wrongly speculated as private. In this paper, we design a robust minimally-sized sparse directory that can offer adequate performance while enjoying the simplicity, scalability, and OS-independence of traditional broadcast-free block-grain coherence. We begin our exploration with a naïve design that does not have a sparse directory and in which the location/sharers of a block are tracked by borrowing a portion of the block's last-level cache (LLC) data way. Such a design, however, lengthens the critical path from two transactions to three transactions (two hops to three hops) for the blocks that experience frequent shared read accesses. We address this problem by architecting a tiny sparse directory that dynamically identifies and tracks a selected subset of the blocks that experience a large volume of shared accesses. We augment the tiny directory proposal with an option of selectively spilling into the LLC space for tracking the coherence of the critical shared blocks that the tiny directory fails to accommodate. A detailed simulation-based study on a 128-core system with a large set of multi-threaded applications spanning scientific, general-purpose, and commercial computing shows that our coherence tracking proposal operating with 1/32× to 1/256× sparse directories offers performance within a percentage of a traditional 2× sparse directory. |
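
To give a rough sense of the directory sizes quoted in the abstract, the sketch below computes entry counts for 2×, 1/32×, and 1/256× sparse directories. It assumes the common convention that an R× sparse directory has R times as many entries as the aggregate number of private-cache blocks it tracks; the abstract does not spell this out. The core count (128) comes from the abstract, while the per-core private cache size and the block size are hypothetical values chosen only for illustration, and `directory_entries` is an illustrative helper, not something defined in the paper.

```python
from fractions import Fraction


def directory_entries(cores, private_cache_bytes, block_bytes, ratio):
    """Entry count of a sparse directory sized at `ratio` times the
    aggregate number of private-cache blocks (assumed sizing convention)."""
    blocks_per_core = private_cache_bytes // block_bytes
    total_private_blocks = cores * blocks_per_core
    return int(total_private_blocks * ratio)


CORES = 128                       # from the abstract
PRIVATE_CACHE_BYTES = 128 * 1024  # hypothetical per-core private cache
BLOCK_BYTES = 64                  # hypothetical cache block size

# Map each sizing ratio (as a string such as "1/32") to its entry count.
sizes = {
    str(ratio): directory_entries(CORES, PRIVATE_CACHE_BYTES, BLOCK_BYTES, ratio)
    for ratio in (Fraction(2), Fraction(1, 32), Fraction(1, 256))
}

for label, entries in sizes.items():
    print(f"{label}x directory: {entries} entries")
```

Under these assumed parameters the 1/256× directory has 512 times fewer entries than the 2× baseline, which is the scale of reduction the abstract refers to.
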
Year | DOI | Venue |
---|---|---|
2017 | 10.1109/HPCA.2017.24 | 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA) |
Keywords | Field | DocType |
sparse directory, cache coherence, shared memory | Broadcasting, Shared memory, Instruction set, Directory, Computer science, Cache, Parallel computing, Computer network, Coherence (physics), Critical path method, Scalability, Distributed computing | Conference |
ISSN | ISBN | Citations |
1530-0897 | 978-1-5090-4986-8 | 3 |
PageRank | References | Authors |
0.38 | 31 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sudhanshu Shukla | 1 | 7 | 1.10 |
Mainak Chaudhuri | 2 | 300 | 18.86 |