Title
Tiny Directory: Efficient Shared Memory in Many-Core Systems with Ultra-Low-Overhead Coherence Tracking
Abstract
The sparse directory has emerged as a critical component for supporting the shared memory abstraction in multi- and many-core chip-multiprocessors. Recent research efforts have explored ways to reduce the number of entries in the sparse directory. These include tracking coherence of private regions at a coarse grain, not tracking blocks that belong to pages identified as private by the operating system (OS), and not tracking a subset of blocks that are speculated to be private by the hardware. These techniques require support for multi-grain coherence, assistance from the OS, or broadcast-based recovery on sharing an untracked block that is wrongly speculated as private. In this paper, we design a robust minimally-sized sparse directory that can offer adequate performance while enjoying the simplicity, scalability, and OS-independence of traditional broadcast-free block-grain coherence. We begin our exploration with a naïve design that has no sparse directory and instead tracks the location/sharers of a block by borrowing a portion of the block's last-level cache (LLC) data way. Such a design, however, lengthens the critical path from two transactions to three transactions (two hops to three hops) for the blocks that experience frequent shared read accesses. We address this problem by architecting a tiny sparse directory that dynamically identifies and tracks a selected subset of the blocks that experience a large volume of shared accesses. We augment the tiny directory proposal with an option of selectively spilling into the LLC space for tracking the coherence of the critical shared blocks that the tiny directory fails to accommodate. A detailed simulation-based study on a 128-core system with a large set of multi-threaded applications spanning scientific, general-purpose, and commercial computing shows that our coherence tracking proposal operating with 1/32× to 1/256× sparse directories offers performance within a percentage of a traditional 2× sparse directory.
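To give a feel for the directory sizes the abstract compares, here is a minimal sketch of the sizing arithmetic. It assumes the size factor (2×, 1/32×, 1/256×) is measured against the aggregate number of private-cache blocks across all cores, which is the usual convention for sparse directories but is not spelled out in the abstract. The 128-core count comes from the abstract; the per-core private cache size, the block size, and the function name `directory_entries` are hypothetical values chosen only for illustration.

```python
from fractions import Fraction


def directory_entries(cores, private_blocks_per_core, size_factor):
    """Entries in a sparse directory sized at `size_factor` times the
    aggregate number of private-cache blocks (assumed convention)."""
    total_private_blocks = cores * private_blocks_per_core
    return int(total_private_blocks * size_factor)


CORES = 128                       # from the abstract
PRIVATE_CACHE_BYTES = 256 * 1024  # assumption: hypothetical per-core private cache
BLOCK_BYTES = 64                  # assumption: hypothetical cache block size
blocks_per_core = PRIVATE_CACHE_BYTES // BLOCK_BYTES

baseline = directory_entries(CORES, blocks_per_core, Fraction(2))
print(f"2x baseline: {baseline} entries")

for factor in (Fraction(1, 32), Fraction(1, 256)):
    tiny = directory_entries(CORES, blocks_per_core, factor)
    print(f"{factor}x directory: {tiny} entries, "
          f"{baseline // tiny}x fewer than the 2x baseline")
```

The absolute entry counts depend on the assumed cache parameters, but the reduction relative to the 2× baseline does not: a 1/32× directory has 64 times fewer entries and a 1/256× directory has 512 times fewer.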
Year: 2017
DOI: 10.1109/HPCA.2017.24
Venue: 2017 IEEE International Symposium on High Performance Computer Architecture (HPCA)
Keywords: sparse directory, cache coherence, shared memory
Field: Broadcasting, Shared memory, Instruction set, Directory, Computer science, Cache, Parallel computing, Computer network, Coherence (physics), Critical path method, Scalability, Distributed computing
DocType: Conference
ISSN: 1530-0897
ISBN: 978-1-5090-4986-8
Citations: 3
PageRank: 0.38
References: 31
Authors: 2
Name               Order  Citations  PageRank
Sudhanshu Shukla   1      7          1.10
Mainak Chaudhuri   2      300        18.86