Title
Resilience to Various Failures for Read-mostly In-memory Data Structures
Abstract
As massively parallel processing (MPP) machines and their associated applications become larger, more work on resiliency is needed if those applications are to have a chance of running for significant lengths of time in the face of the expected component failure rates. This paper describes an approach for protecting large read-mostly in-memory data structures from various forms of failures by applying the concept of software erasure-correcting codes. A prototype library for this scheme was implemented on the Cray XMT and applied to a sample application. It is also portable to other global shared memory architectures that meet certain requirements, including the Cray XE.
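The abstract names software erasure-correcting codes as the protection mechanism but does not say which code the prototype library uses. As a minimal sketch only, the following shows the simplest such code, a single XOR parity block over equal-sized data blocks, which can rebuild any one lost block. The function names (`xor_bytes`, `make_parity`, `recover_block`, `update_parity`) are hypothetical and are not taken from the paper or from any Cray library.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(blocks):
    """Compute one parity block as the XOR of all data blocks."""
    return reduce(xor_bytes, blocks)


def recover_block(blocks, parity, lost):
    """Rebuild the block at index `lost` from the surviving blocks and the parity.

    The entry at blocks[lost] is ignored, so it may hold garbage or None.
    """
    survivors = [b for i, b in enumerate(blocks) if i != lost]
    return reduce(xor_bytes, survivors, parity)


def update_parity(parity, old_block, new_block):
    """Patch the parity after one block changes, without rereading the others."""
    return xor_bytes(parity, xor_bytes(old_block, new_block))


if __name__ == "__main__":
    data = [b"abcd", b"efgh", b"ijkl"]
    parity = make_parity(data)
    # Simulate losing block 1 and rebuilding it from the rest.
    damaged = [data[0], None, data[2]]
    print(recover_block(damaged, parity, 1) == data[1])
```

In this sketch reads never touch the parity and each write costs one extra parity patch, which is one plausible reason such a scheme fits read-mostly data. Tolerating more than one simultaneous loss would need a stronger code than single parity.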
Year
2012
DOI
10.1109/IPDPSW.2012.198
Venue
IPDPS Workshops
Keywords
global shared memory architecture, expected component failure rate, cray xe, various failures, prototype library, certain requirement, parallel processing, read-mostly in-memory data structures, large read-mostly in-memory data, associated application, sample application, cray xmt, resilience, databases, registers, data structures, xenon, memory management, erasure codes, face
Field
Psychological resilience, Data structure, Shared memory, Cray XMT, Massively parallel, Computer science, Parallel computing, Software, Memory management, Erasure code, Distributed computing
DocType
Conference
ISSN
2164-7062
Citations
1
PageRank
0.36
References
0
Authors
4
Name, Order, Citations, PageRank
Larry Kaplan, 1, 138, 8.52
Preston Briggs, 2, 379, 32.43
Miles Ohlrich, 3, 1, 0.36
Will Leslie, 4, 1, 0.36