Title
DaemonGuard: Enabling O/S-Orchestrated Fine-Grained Software-Based Selective-Testing in Multi-/Many-Core Microprocessors
Abstract
As technology scales deep into the sub-micron regime, transistors become less reliable. Future systems are widely predicted to suffer from considerable aging and wear-out effects. This ominous threat has urged system designers to develop effective run-time testing methodologies that can monitor and assess the system's health. In this work, we investigate the potential of online software-based functional testing at the granularity of individual microprocessor core components in multi-/many-core systems. While existing techniques monolithically test the entire core, our approach aims to reduce testing time by avoiding the over-testing of under-utilized units. To facilitate fine-grained testing, we introduce DaemonGuard, a framework that enables the real-time observation of individual sub-core modules and performs on-demand selective testing of only the modules that have recently been stressed. Moreover, we investigate the impact of the cache hierarchy on the testing process and we develop a cache-aware selective testing methodology that significantly expedites the execution of memory-intensive test programs. The monitoring and test-initiation process is orchestrated by a transparent, minimally-intrusive, and lightweight operating system process that observes the utilization of individual datapath components at run-time. We perform a series of experiments using a full-system, execution-driven simulation framework running a commodity operating system, real multi-threaded workloads, and test programs. Our results indicate that operating-system-assisted selective testing at the sub-core level leads to substantial savings in testing time and very low impact on system performance. Additionally, the cache-aware testing technique is shown to be very effective in exploiting the memory hierarchy to further minimize the testing time.
Year: 2016
DOI: 10.1109/TC.2015.2449840
Venue: IEEE Transactions on Computers
Keywords: On-line testing, software-based self-testing, system reliability, adaptive testing
Field: System integration testing, Memory hierarchy, Computer science, System testing, Parallel computing, Functional testing, Real-time computing, Software performance testing, White-box testing, Software reliability testing, Cloud testing, Embedded system
DocType: Journal
Volume: PP
Issue: 99
ISSN: 0018-9340
Citations: 3
PageRank: 0.38
References: 28
Authors: 3
Name                      Order  Citations  PageRank
Michael A. Skitsas        1      12         1.89
Chrysostomos Nicopoulos   2      835        50.37
Maria K. Michael          3      176        25.89