Title | ||
---|---|---|
Continuous runahead: Transparent hardware acceleration for memory intensive workloads. |
Abstract | ||
---|---|---|
Runahead execution pre-executes the application's own code to generate new cache misses. This pre-execution results in prefetch requests that are overwhelmingly accurate (95% in a realistic system configuration for the memory intensive SPEC CPU2006 benchmarks), much more so than a global history buffer (GHB) or stream prefetcher (by 13%/19%, respectively). However, we also find that current runahead techniques are very limited in coverage: they prefetch only a small fraction (13%) of all runahead-reachable cache misses. This is because runahead intervals are short and limited by the duration of each full-window stall. In this work, we explore removing the constraints that lead to these short intervals. We dynamically filter the instruction stream to identify the chains of operations that cause the pipeline to stall. These operations are renamed to execute speculatively in a loop and are then migrated to a Continuous Runahead Engine (CRE), a shared multi-core accelerator located at the memory controller. The CRE runs ahead with the chain continuously, increasing prefetch coverage to 70% of runahead-reachable cache misses. The result is a 43.3% weighted speedup gain on a set of memory intensive quad-core workloads and a significant reduction in system energy consumption. This is a 21.9% performance gain over the Runahead Buffer, a state-of-the-art runahead proposal, and a 13.2%/13.5% gain over GHB/stream prefetching, respectively. When the CRE is combined with GHB prefetching, we observe a 23.5% gain over a baseline with GHB prefetching alone.
|
Year | DOI | Venue |
---|---|---|
2016 | 10.5555/3195638.3195712 | MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture, Taipei, Taiwan, October 2016 |
Keywords | Field | DocType |
continuous runahead,transparent hardware acceleration,memory intensive workloads,realistic system configuration,memory intensive SPEC CPU2006 benchmarks,global history buffer,GHB,stream prefetcher,continuous runahead engine,CRE,shared multicore accelerator,memory controller,runahead buffer | Runahead,Cache,Computer science,Parallel computing,Real-time computing,Hardware acceleration,Instruction prefetch,Spec#,Energy consumption,Memory controller,Speedup | Conference |
ISSN | ISBN | Citations |
---|---|---|
1072-4451 | 978-1-4503-4952-9 | 5 |
PageRank | References | Authors |
---|---|---|
0.41 | 22 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Milad Hashemi | 1 | 81 | 5.95 |
Onur Mutlu | 2 | 9446 | 357.40 |
Yale N. Patt | 3 | 4947 | 566.20 |