Title
Relaxing Consistency In Recoverable Distributed Shared-Memory
Abstract
Relaxed memory consistency models tolerate increased memory access latency in both hardware and software distributed shared memory systems. In recoverable systems, relaxing consistency has the added benefit of reducing the number of checkpoints needed to avoid rollback propagation. In this paper, we introduce new checkpointing algorithms that take advantage of relaxed consistency to reduce the performance overhead of checkpointing. We also introduce a scheme based on lazy relaxed consistency, that reduces both checkpointing overhead and the overhead of avoiding error propagation in systems with error latency. We use multiprocessor address traces to evaluate the relaxed consistency approach to checkpointing with distributed shared memory.
In this paper we show that, in shared-memory computer systems which require recoverability from transient node errors, relaxing consistency has the added benefit of decreasing the performance overhead of independent checkpointing and rollback recovery. We present checkpointing algorithms that take advantage of the conditions for relaxed consistency to reduce the minimum number of checkpoints required for correct operation. We also show how the checkpointing scheme reduces the cost of avoiding error propagation in systems with lazy relaxed consistency. We use multiprocessor address traces to evaluate the techniques by trace-driven simulation.
Year: 1993
DOI: 10.1109/FTCS.1993.627319
Venue: FTCS-23 - TWENTY-THIRD INTERNATIONAL SYMPOSIUM ON FAULT-TOLERANT COMPUTING : DIGEST OF PAPERS
Keywords: parallel processing, error propagation, distributed processing, propagation, computer programming, workstations, distributed computing, shared memory, algorithms, consistency, parallel programming, recovery, hardware, message passing, distributed shared memory, computer architecture
Field: Latency (engineering), Computer science, Parallel computing, Multiprocessing, Consistency model, Distributed shared memory, Rollback, Message passing, Computer programming, Cache coherence, Distributed computing
DocType: Conference
Citations: 30
PageRank: 1.81
References: 24
Authors: 2
Name: Bob Janssens; Order: 1; Citations: 30; PageRank: 1.81
Name: W. Kent Fuchs; Order: 2; Citations: 1469; PageRank: 279.02