Title
StageNet: A Reconfigurable Fabric for Constructing Dependable CMPs
Abstract
CMOS scaling has long been a source of dramatic performance gains. However, shrinking semiconductor feature sizes have raised operating temperatures and current densities. Because most wearout mechanisms depend strongly on these parameters, significantly higher failure rates are projected for future technology generations. Consequently, fault tolerance, traditionally a concern of the high-end server market, is now receiving attention in mainstream computing systems. The popular solution has been redundancy at a coarse granularity, such as dual or triple modular redundancy. In this work, we challenge the practice of coarse-granularity redundancy by identifying its inability to scale to high-failure-rate scenarios and by investigating the advantages of finer-grained configurations. To this end, this paper presents and evaluates a highly reconfigurable CMP architecture, named StageNet (SN), that is designed with reliability as a first-class design criterion. SN relies on a reconfigurable network of replicated processor pipeline stages to maximize the useful lifetime of a chip, gracefully degrading performance toward the end of life. Our results show that the proposed SN architecture can perform 40 percent more cumulative work than a traditional CMP over a 12-year lifetime.
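The contrast between coarse-grained and stage-level redundancy described in the abstract can be illustrated with a small counting sketch. This is not the paper's evaluation model: the four-stage pipeline, the fault map, and the function names are all hypothetical, and the sketch assumes that any working stage can be connected to any other working stage, ignoring interconnect faults and performance effects.

```python
# Hypothetical fault map: one tuple per core, one boolean per pipeline stage
# in the assumed order (fetch, decode, issue, execute). True = still working.
faults = [
    (True, False, True, True),   # core 0: decode stage failed
    (True, True, False, True),   # core 1: issue stage failed
    (False, True, True, True),   # core 2: fetch stage failed
    (True, True, True, True),    # core 3: fully working
]


def core_level_pipelines(cores):
    """Coarse-grained view: a core is usable only if all of its own stages work."""
    return sum(all(core) for core in cores)


def stage_level_pipelines(cores):
    """Stage-level view (assumption: any working stage can be wired to any other).

    The number of complete pipelines is limited by the scarcest stage type.
    """
    return min(sum(column) for column in zip(*cores))


print(core_level_pipelines(faults))   # 1
print(stage_level_pipelines(faults))  # 3
```

With this made-up fault map, three of the four cores are lost under core-level isolation, while stage-level reconfiguration could still assemble three complete pipelines from the surviving stages.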
Year: 2011
DOI: 10.1109/TC.2010.205
Venue: IEEE Transactions on Computers
Keywords: CMOS logic circuits, fault tolerant computing, microprocessor chips, pipeline processing, reconfigurable architectures, CMOS scaling, StageNet, coarse-granularity redundancy, dependable CMP construction, dual-triple modular redundancy, fault tolerance, finer-grained configurations, reconfigurable CMP architecture, reconfigurable fabric, replicated processor pipeline, semiconductor feature size reduction, CMP, reliability, multicore, wearout
Field: Pipeline transport, Computer science, Parallel computing, Triple modular redundancy, Failure rate, Chip, Real-time computing, Redundancy (engineering), Fault tolerance, Granularity, Multi-core processor, Embedded system
DocType: Journal
Volume: 60
Issue: 1
ISSN: 0018-9340
Citations: 15
PageRank: 0.80
References: 22
Authors: 4
Name | Order | Citations | PageRank
Shantanu Gupta | 1 | 81 | 3.82
Shuguang Feng | 2 | 15 | 0.80
Amin Ansari | 3 | 25 | 1.38
Scott Mahlke | 4 | 4811 | 312.08