Recoverable Mutual Exclusion Under System-Wide Failures. - Citegraph

Paper Info

Title
Recoverable Mutual Exclusion Under System-Wide Failures.

Abstract
Recoverable mutual exclusion (RME) is a variation on the classic mutual exclusion (ME) problem that allows processes to crash and recover. The time complexity of RME algorithms is quantified in the same way as for ME, namely by counting remote memory references -- expensive memory operations that traverse the processor-to-memory interconnect. Prior work has established that the RMR complexity of the RME problem for n processes is Θ(log n) for the class of algorithms that use read/write registers and single-word comparison primitives such as Compare-And-Swap (Golab and Ramaraju 2016), O(log n / log log n) for the class of algorithms that use read/write registers and additional single-word read-modify-primitives such as Fetch-And-Store (Golab and Hendler 2017), and Θ(1) for the class of algorithms that use read/write registers and specialized double-word read-modify-write primitives (Golab and Hendler 2017). These complexity bounds hold in a model of computation where processes may fail independently, and where a process that fails while accessing the mutex is required to recover eventually. This body of work leaves open two important questions: (i) what is the tight bound on the RMR complexity of RME for the class of algorithms that use read/write registers and commonly supported single-word read-modify-primitives; and (ii) how is the RMR complexity of RME affected by variations in the failure model? This paper answers both questions partially by showing that RME can be solved using O(1) RMRs per passage in the worst case in a model where failures are system-wide (i.e., all processes crash simultaneously), and processes receive additional information from the environment regarding the occurrence of the failure. The upper bound algorithm we present relies crucially on a novel RMR-efficient barrier that processes use to synchronize recovery actions after each failure. The barrier uses read/write registers and single-word Compare-And-Swap only. Additionally, we present a transformation that can add properties such as critical section re-entry and a strong notion of starvation freedom to any RME algorithm while preserving its asymptotic RMR complexity.

Year	DOI	Venue
2018	10.1145/3212734.3212755	PODC '18: ACM Symposium on Principles of Distributed Computing Egham United Kingdom July, 2018
Keywords	Field	DocType
Mutual exclusion, recovery, fault tolerance, concurrency, shared memory, multi-core algorithms, non-volatile main memory, persistent data structures	Semaphore,Shared memory,Computer science,Upper and lower bounds,Concurrency,Critical section,Theoretical computer science,Model of computation,Time complexity,Mutual exclusion	Conference
ISBN	Citations	PageRank
978-1-4503-5795-1	0	0.34
References	Authors
19	2

Authors (2 rows)

Cited by (0 rows)

References (19 rows)

Name	Order	Citations	PageRank
Wojciech Golab	1	210	17.22
Danny Hendler	2	726	44.54

1