Title
Robust Parallel Computations through Randomization
Abstract
In this paper we present an efficient general simulation strategy for computations designed for fully operational BSP machines of n ideal processors, on n-processor dynamic-fault-prone BSP machines. The fault occurrences are fail- stop and fully dynamic, i.e., they are allowed to happen on-line at any point of the computation, subject to the constraint that the total number of faulty processors may never exceed a known fraction. The computational paradigm can be exploited for robust computations over virtual parallel settings with a volatile underlying infrastructure, such as a NETWORK OF WORKSTATIONS(where workstations may be taken out of the virtual parallel machine by their owner). Our simulation strategy is Las Vegas (i.e., it may never fail, due to backtrack- ing operations to robustly stored instances of the computation, in case of locally unrecoverable situations). It adopts an adaptive balancing scheme of the workload among the currently live processors of the BSP machine. Our strategy is efficient in the sense that, compared with an optimal off-line adversarial computation under the same sequence of fault occurrences, it achieves
Year
DOI
Venue
2000
10.1007/s002240010009
Theory of Computing Systems \/ Mathematical Systems Theory
Keywords
Field
DocType
Failure Probability,Fault Tolerance,Storage Scheme,Virtual Processor,Fault Occurrence
Computer science,Workload,Parallel computing,Workstation,Fault tolerance,Virtual Processor,Fault occurrence,Computation,Distributed computing
Journal
Volume
Issue
ISSN
33
5/6
1433-0490
Citations 
PageRank 
References 
5
0.42
26
Authors
4
Name
Order
Citations
PageRank
Spyros Kontogiannis125928.04
Grammati E. Pantziou236642.33
Paul G. Spirakis32222299.05
Moti Yung4120801152.41