Fault Tolerance in Distributed Shared Memory Multiprocessors - Citegraph

Paper Info

Title
Fault Tolerance in Distributed Shared Memory Multiprocessors

Abstract
Massively parallel systems represent a new challenge for fault tolerance. The designers of such systems cannot expect that no parts of the system will fail. With the significant increase in the complexity and number of components the chance of a single or multiple failure is no longer negligible. It is clear that the redundancy, reconfigurability and diagnosis techniques must be incorporated at the design stage itself and not as a subsequent add-on. In this paper we discuss the fault tolerance techniques developed for MEMSY, a massively parallel architecture. These techniques can, in principle, be easily transferred to other distributed shared memory multiprocessors.

Year	1993
DOI	10.1007/3-540-57307-0_24
Venue	Parallel Computer Architectures
Keywords	fault tolerance, shared memory multiprocessors, fault tolerant, distributed shared memory
Field	Supercomputer architecture, Uniform memory access, Shared memory, Massively parallel, Computer science, Parallel computing, Distributed memory, Cache-only memory architecture, Fault tolerance, Distributed shared memory, Distributed computing
DocType	Conference
ISBN	3-540-57307-0
Citations	6
PageRank	0.73
References	13
Authors	8

Authors (8 rows)

Name	Order	Citations	PageRank
Mario Dal Cin	1	282	40.09
A. Crygier	2	15	1.72
H. Hessenauer	3	15	1.72
U. Hildebrand	4	18	2.26
J. Hönig	5	13	1.28
Wolfgang Hohl	6	65	9.25
Edgar Michel	7	7	1.07
András Pataricza	8	514	55.25
