Title
Feng shui of supercomputer memory: positional effects in DRAM and SRAM faults
Abstract
Several recent publications confirm that faults are common in high-performance computing systems. Therefore, further attention to the faults experienced by such computing systems is warranted. In this paper, we present a study of DRAM and SRAM faults in large high-performance computing systems. Our goal is to understand the factors that influence faults in production settings. We examine the impact of aging on DRAM, finding a marked shift from permanent to transient faults in the first two years of DRAM lifetime. We examine the impact of DRAM vendor, finding that fault rates vary by more than 4x among vendors. We examine the physical location of faults in a DRAM device and in a data center; contrary to prior studies, we find no correlations with either. Finally, we study the impact of altitude and rack placement on SRAM faults, finding that, as expected, altitude has a substantial impact on SRAM faults, and that top-of-rack placement correlates with a 20% higher fault rate.
Year
2013
DOI
10.1145/2503210.2503257
Venue
High Performance Computing, Networking, Storage and Analysis
Keywords
dram device, higher fault rate, positional effect, high-performance computing system, computing system, feng shui, supercomputer memory, large high-performance computing system, dram lifetime, sram fault, substantial impact, dram vendor, fault rate, memory, phase change
Field
DRAM, Supercomputer, Computer science, Parallel computing, Fault rate, Universal memory, Static random-access memory, Data center, CAS latency, Computing systems, Embedded system
DocType
Conference
ISSN
2167-4329
ISBN
978-1-4503-2378-9
Citations
69
PageRank
1.91
References
13
Authors
5
Name | Order | Citations | PageRank
Vilas Sridharan | 1 | 512 | 23.45
Jon Stearley | 2 | 651 | 24.52
Nathan DeBardeleben | 3 | 490 | 31.71
Sean Blanchard | 4 | 190 | 13.20
Sudhanva Gurumurthi | 5 | 1232 | 78.23