Title
Using Low Cost Erasure and Error Correction Schemes to Improve Reliability of Commodity DRAM Systems
Abstract
Most server-grade systems provide Chipkill-Correct error protection at the expense of power and performance. In this paper we present a low-overhead solution for improving the reliability of commodity DRAM systems with no change to the existing memory architecture. Specifically, we propose five erasure and error correction (E-ECC) schemes that provide at least Chipkill-Correct protection for x4 (Schemes 1, 2 and 3), x8 (Scheme 4) and x16 (Scheme 5) DRAM systems. All schemes have superior error correction performance due to the use of strong symbol-based codes. Synthesis results at the 28 nm node show that the decoding latency of these codes is negligible compared to the DRAM access latency. In addition, we make use of erasure codes to extend the lifetime of the DRAM systems. Specifically, once a chip is marked faulty due to persistent errors, all E-ECC schemes correct erasures due to that faulty chip and also correct an additional random error in a second chip. Evaluation with SPEC2006 workloads shows that, compared to x4 Chipkill-Correct schemes, Scheme 5 has the highest IPC improvement (mean of 7 percent) and Scheme 4 has the largest power reduction (mean of 18 percent) and the largest increase in energy efficiency (mean of 25 percent).
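The combined erasure-plus-error capability described in the abstract can be illustrated with the standard decoding bound for symbol-based codes: a code with minimum distance d can correct e symbol errors and f symbol erasures whenever 2e + f <= d - 1. The sketch below is only an illustration of that bound. It assumes a maximum-distance-separable code (such as Reed-Solomon, where d = n - k + 1) and one code symbol per DRAM chip; the abstract does not state which codes or parameters the five schemes use, so the function name and the (n, k) values are hypothetical.

```python
def can_decode(n: int, k: int, errors: int, erasures: int) -> bool:
    """Check the errors-and-erasures bound for an (n, k) MDS symbol code.

    Assumption (not from the abstract): the code is MDS, so its minimum
    distance is d = n - k + 1, and each chip contributes one symbol.
    """
    if not 0 < k < n:
        raise ValueError("require 0 < k < n")
    if errors < 0 or erasures < 0:
        raise ValueError("error and erasure counts must be non-negative")
    min_distance = n - k + 1
    # Each unknown-location error costs two units of distance,
    # each known-location erasure costs one.
    return 2 * errors + erasures <= min_distance - 1


# Hypothetical parameters: 16 data symbols plus 2 or 3 check symbols.
print(can_decode(18, 16, errors=1, erasures=0))  # one random chip error
print(can_decode(18, 16, errors=1, erasures=1))  # faulty chip + random error
print(can_decode(19, 16, errors=1, erasures=1))  # same pattern, one more check symbol
```

Under these assumed parameters, two check symbols cover a single random symbol error but not a marked-faulty chip plus an additional random error; a third check symbol is enough for that pattern.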
Year: 2016
DOI: 10.1109/TC.2016.2550455
Venue: IEEE Trans. Computers
Keywords: Random access memory, Error correction codes, DRAM memory, Error correction, Decoding
Field: DRAM, Soft error, Computer science, Parallel computing, Real-time computing, Chip, Error detection and correction, Decoding methods, Erasure code, Memory architecture, Erasure
DocType: Journal
Volume: 65
Issue: 12
ISSN: 0018-9340
Citations: 3
PageRank: 0.37
References: 15
Authors: 7
Name | Order | Citations | PageRank
Hsing-Min Chen | 1 | 14 | 2.48
Supreet Jeloka | 2 | 41 | 6.41
Akhil Arunkumar | 3 | 55 | 4.12
David Blaauw | 4 | 8916 | 823.47
Carole-Jean Wu | 5 | 31 | 2.69
Trevor Mudge | 6 | 6139 | 659.74
Chaitali Chakrabarti | 7 | 1978 | 184.17