Title
Shared memory computing on clusters with symmetric multiprocessors and system area networks
Abstract
Cashmere is a software distributed shared memory (S-DSM) system designed for clusters of server-class machines. It is distinguished from most other S-DSM projects by (1) the effective use of fast user-level messaging, as provided by modern system-area networks, and (2) a “two-level” protocol structure that exploits hardware coherence within multiprocessor nodes. Fast user-level messages change the tradeoffs in coherence protocol design; they allow Cashmere to employ a relatively simple directory-based coherence protocol. Exploiting hardware coherence within SMP nodes improves overall performance when care is taken to avoid interference with inter-node software coherence.

We have implemented Cashmere on a Compaq AlphaServer/Memory Channel cluster, an architecture that provides fast user-level messages. Experiments indicate that a one-level version of the Cashmere protocol provides performance comparable to, or slightly better than, that of TreadMarks' lazy release consistency. Comparisons to Compaq's Shasta protocol also suggest that while fast user-level messages make finer-grain software DSMs competitive, VM-based systems continue to outperform software-based access control for applications without extensive fine-grain sharing.

Within the family of Cashmere protocols, we find that leveraging intranode hardware coherence provides a 37% performance advantage over a more straightforward one-level implementation. Moreover, contrary to our original expectations, noncoherent hardware support for remote memory writes, total message ordering, and broadcast provides comparatively little additional benefit over fast messaging alone for our application suite.
Year
2005
DOI
10.1145/1082469.1082472
Venue
ACM Trans. Comput. Syst.
Keywords
system design, shared memory, distributed shared memory, access control
Field
Shared memory, Computer science, Distributed memory, Computer network, Real-time computing, Multiprocessing, Software, Distributed shared memory, TreadMarks, Multi-channel memory architecture, Release consistency, Distributed computing
DocType
Journal
Volume
23
Issue
3
ISSN
0734-2071
Citations
7
PageRank
0.52
References
48
Authors
7
Name | Order | Citations | PageRank
Leonidas I. Kontothanassis | 1 | 239 | 25.05
Robert Stets | 2 | 444 | 67.87
Galen C. Hunt | 3 | 889 | 61.08
Umit Rencuzogullari | 4 | 28 | 2.18
Gautam Altekar | 5 | 330 | 18.06
Sandhya Dwarkadas | 6 | 3504 | 257.31
Michael L. Scott | 7 | 2843 | 248.01