Abstract | ||
---|---|---|
Relaxed memory consistency models tolerate increased memory access latency in both hardware and software distributed shared memory systems. In recoverable systems, relaxing consistency has the added benefit of reducing the number of checkpoints needed to avoid rollback propagation. In this paper, we introduce new checkpointing algorithms that take advantage of relaxed consistency to reduce the performance overhead of checkpointing. We also introduce a scheme based on lazy relaxed consistency that reduces both checkpointing overhead and the overhead of avoiding error propagation in systems with error latency. We use multiprocessor address traces to evaluate the relaxed consistency approach to checkpointing with distributed shared memory. In this paper we show that, in shared-memory computer systems which require recoverability from transient node errors, relaxing consistency has the added benefit of decreasing the performance overhead of independent checkpointing and rollback recovery. We present checkpointing algorithms that take advantage of the conditions for relaxed consistency to reduce the minimum number of checkpoints required for correct operation. We also show how the checkpointing scheme reduces the cost of avoiding error propagation in systems with lazy relaxed consistency. We use multiprocessor address traces to evaluate the techniques by trace-driven simulation. |
Year | DOI | Venue |
---|---|---|
1993 | 10.1109/FTCS.1993.627319 | FTCS-23 - Twenty-Third International Symposium on Fault-Tolerant Computing: Digest of Papers |
Keywords | Field | DocType |
parallel processing,error propagation,distributed processing,propagation,computer programming,workstations,distributed computing,shared memory,algorithms,consistency,parallel programming,recovery,hardware,message passing,distributed shared memory,computer architecture | Latency (engineering),Computer science,Parallel computing,Multiprocessing,Consistency model,Distributed shared memory,Rollback,Message passing,Computer programming,Cache coherence,Distributed computing | Conference |
Citations | PageRank | References |
30 | 1.81 | 24 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Bob Janssens | 1 | 30 | 1.81 |
W. Kent Fuchs | 2 | 1469 | 279.02 |