Title
HA-PSLS: A Highly Available Parallel Single-Level Store System
Abstract
Parallel single-level store (PSLS) systems integrate a shared virtual memory and a parallel file system. They provide programmers with a global address space that includes both memory and file data. PSLS systems implemented in a cluster thus offer natural support for long-running parallel applications, combining the shared-memory programming model with a large and efficient file system. However, the need to tolerate failures in such a system increases with the size of applications. In this paper we present a highly available parallel single-level store system (HA-PSLS), which smoothly integrates a backward error recovery high-availability mechanism into a PSLS system. Our system is able to tolerate multiple transient failures, a single permanent failure, and power cut failures affecting the whole cluster, without requiring any specialized hardware. For this purpose, HA-PSLS relies on a high degree of integration (and reusability) of high-availability and standard features. A prototype integrating our high-availability support has been implemented, and we show some performance results in the paper. Copyright (C) 2003 John Wiley & Sons, Ltd.
Year
2003
DOI
10.1002/cpe.739
Venue
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE
Keywords
parallel single-level store, high-availability, fault tolerance, checkpointing, replication, integration, parallel file systems, shared virtual memory
DocType
Journal
Volume
15
Issue
10
ISSN
1532-0626
Citations
0
PageRank
0.34
References
15
Authors
2
Name
Order
Citations
PageRank
Anne-Marie Kermarrec 16649453.63
Christine Morin 222626.78