Abstract | ||
---|---|---|
Checkpointing techniques in parallel systems use dependency tracking and/or message logging to ensure that a system rolls back to a consistent state. Traditional dependency tracking in distributed shared memory (DSM) systems is expensive because of high communication frequency. In this paper we show that, if designed correctly, a DSM system only needs to consider dependencies due to the transfer of blocks of data, resulting in reduced dependency tracking overhead and reduced potential for rollback propagation. We develop an ownership timestamp scheme to tolerate the loss of block state information and develop a passive server model of execution where interactions between processors are considered atomic. With our scheme, dependencies are significantly reduced compared to the traditional message-passing model. |
Year | DOI | Venue |
---|---|---|
1994 | 10.1109/RELDIS.1994.336911 | Dana Point, CA |
Keywords | Field | DocType |
distributed memory systems,fault tolerant computing,block state information,checkpointing techniques,dependency tracking,interprocessor dependence,message logging,ownership timestamp scheme,parallel systems,passive server model,recoverable distributed shared memory,rollback propagation | Computer science,Parallel computing,Real-time computing,Redundancy (engineering),Fault tolerance,Timestamp,Concurrent computing,Distributed shared memory,Application software,Rollback,Message passing,Distributed computing | Conference |
Citations | PageRank | References |
16 | 0.97 | 21 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
B. Janssens | 1 | 16 | 0.97 |
W. Kent Fuchs | 2 | 1469 | 279.02 |