Abstract | ||
---|---|---|
As CMOS feature sizes venture deep into the nanometer regime, wearout mechanisms including negative-bias temperature instability and time-dependent dielectric breakdown can severely reduce processor operating lifetimes and performance. This paper presents an introspective reliability management system, Maestro, to tackle reliability challenges in future chip multiprocessors (CMPs) head-on. Unlike traditional approaches, Maestro relies on low-level sensors to monitor the CMP as it ages (introspection). Leveraging this real-time assessment of CMP health, runtime heuristics identify wearout-centric job assignments (management). By exploiting the complementary effects of the natural heterogeneity (due to process variation and wearout) that exists in CMPs and the diversity found in system workloads, Maestro composes job schedules that intelligently control the aging process. Monte Carlo experiments show that Maestro significantly enhances lifetime reliability through intelligent wear-leveling, increasing the expected service life of a population of 16-core CMPs by as much as 38% compared to a naive, round-robin scheduler. Furthermore, in the presence of process variation, Maestro's wearout-centric scheduling outperforms both performance-counter-based and temperature-sensor-based schedulers, achieving an order of magnitude more improvement in lifetime throughput, defined as the amount of useful work done by a system prior to failure. |
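
The abstract describes wearout-centric job assignment only at a high level. As a rough illustration of the general idea, and not the paper's actual heuristic, the sketch below greedily pairs the most stressful job with the healthiest core. The names `Core`, `Job`, `health`, `stress`, and `assign_jobs` are hypothetical, as is the assumption that sensor readings and workload stress can each be reduced to a single number.

```python
from dataclasses import dataclass


@dataclass
class Core:
    core_id: int
    health: float  # hypothetical sensor-derived score; higher means less worn


@dataclass
class Job:
    name: str
    stress: float  # hypothetical estimate of the wear this job induces


def assign_jobs(cores, jobs):
    """Greedy wear-leveling sketch: most stressful job -> healthiest core.

    This is an illustrative assumption, not Maestro's published algorithm.
    If there are more jobs than cores, the extra jobs are left unassigned.
    """
    ranked_cores = sorted(cores, key=lambda c: c.health, reverse=True)
    ranked_jobs = sorted(jobs, key=lambda j: j.stress, reverse=True)
    return {job.name: core.core_id for job, core in zip(ranked_jobs, ranked_cores)}
```

In a real system the health scores would be refreshed from the sensors at each scheduling interval, so the pairing shifts as cores age at different rates.
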
Year | DOI | Venue |
---|---|---|
2010 | 10.1007/978-3-642-11515-8_15 | HiPEAC |
Keywords | Field | DocType |
maestro composes job schedule,cmp health,negative-bias temperature instability,lifetime throughput,process variation,lifetime reliability,reliability challenge,chip multiprocessors,system workloads,16-core cmps,introspective reliability management system,service life,job scheduling,management system,real time,intelligent control | Population,Scheduling (computing),Computer science,Parallel computing,Chip,Real-time computing,CMOS,Schedule,Heuristics,Process variation,Throughput,Embedded system | Conference |
Volume | ISSN | ISBN |
5952 | 0302-9743 | 3-642-11514-4 |
Citations | PageRank | References |
21 | 0.80 | 17 |
Authors | ||
---|---|---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shuguang Feng | 1 | 306 | 12.96 |
Shantanu Gupta | 2 | 390 | 16.39 |
Amin Ansari | 3 | 361 | 15.88 |
Scott Mahlke | 4 | 4811 | 312.08 |