Title
Continuous runahead: Transparent hardware acceleration for memory intensive workloads.
Abstract
Runahead execution pre-executes the application's own code to generate new cache misses. This pre-execution results in prefetch requests that are overwhelmingly accurate (95% in a realistic system configuration for the memory intensive SPEC CPU2006 benchmarks), much more so than a global history buffer (GHB) or stream prefetcher (by 13%/19%). However, we also find that current runahead techniques are very limited in coverage: they prefetch only a small fraction (13%) of all runahead-reachable cache misses. This is because runahead intervals are short and limited by the duration of each full-window stall. In this work, we explore removing the constraints that lead to these short intervals. We dynamically filter the instruction stream to identify the chains of operations that cause the pipeline to stall. These operations are renamed to execute speculatively in a loop and are then migrated to a Continuous Runahead Engine (CRE), a shared multi-core accelerator located at the memory controller. The CRE runs ahead with the chain continuously, increasing prefetch coverage to 70% of runahead-reachable cache misses. The result is a 43.3% weighted speedup gain on a set of memory intensive quad-core workloads and a significant reduction in system energy consumption. This is a 21.9% performance gain over the Runahead Buffer, a state-of-the-art runahead proposal, and a 13.2%/13.5% gain over GHB/stream prefetching. When the CRE is combined with GHB prefetching, we observe a 23.5% gain over a baseline with GHB prefetching alone.
Year
2016
DOI
10.5555/3195638.3195712
Venue
MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture, Taipei, Taiwan, October 2016
Keywords
continuous runahead, transparent hardware acceleration, memory intensive workloads, realistic system configuration, memory intensive SPEC CPU2006 benchmarks, global history buffer, GHB, stream prefetcher, continuous runahead engine, CRE, shared multicore accelerator, memory controller, runahead buffer
Field
Runahead, Cache, Computer science, Parallel computing, Real-time computing, Hardware acceleration, Instruction prefetch, Spec#, Energy consumption, Memory controller, Speedup
DocType
Conference
ISSN
1072-4451
ISBN
978-1-4503-4952-9
Citations
5
PageRank
0.41
References
22
Authors
3
Name            Order  Citations  PageRank
Milad Hashemi   1      81         5.95
Onur Mutlu      2      9446       357.40
Yale N. Patt    3      4947       566.20