Title
Scalable Store-Load Forwarding via Store Queue Index Prediction
Abstract
Conventional processors use a fully-associative store queue (SQ) to implement store-load forwarding. Associative search latency does not scale well to capacities and bandwidths required by wide-issue, large window processors. In this work, we improve SQ scalability by implementing store-load forwarding using speculative indexed access rather than associative search. Our design uses prediction to identify the single SQ entry from which each dynamic load is most likely to forward. When a load executes, it either obtains its value from the predicted SQ entry (if the address of the entry matches the load address) or the data cache (otherwise). A forwarding mis-prediction, detected by pre-commit filtered load re-execution, results in a pipeline flush. SQ index prediction is generally accurate, but for some loads it cannot reliably identify a single SQ entry. To avoid flushes on these difficult loads while keeping the single-SQ-access-per-load invariant, a second predictor delays difficult loads until all but the youngest of their "candidate" stores have committed. Our predictors are inspired by store-load dependence predictors for load scheduling (Store Sets and the Exclusive Collision Predictor) and unify load scheduling and forwarding. Experiments on the SPEC2000 and MediaBench benchmarks show that on an 8-way issue processor with a 512-entry reorder buffer, our technique performs within 3.3% of an ideal associative SQ (same latency as the data cache) and either matches or exceeds the performance of a realistic associative SQ (slower than data cache) on 31 of 47 programs.
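The load-execution rule described in the abstract can be illustrated with a small software model. The sketch below is only a toy under stated assumptions, not the authors' hardware design: the names `StoreEntry`, `StoreQueue`, `execute_load`, and `needs_flush` are hypothetical, the data cache is modeled as a plain dictionary, the predicted index is passed in rather than produced by a predictor, and the check in `needs_flush` uses a full scan as a software oracle standing in for the paper's filtered load re-execution.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class StoreEntry:
    addr: int
    value: int


class StoreQueue:
    """Toy store queue read by index; entries are kept oldest to youngest."""

    def __init__(self) -> None:
        self.entries: List[StoreEntry] = []

    def push(self, addr: int, value: int) -> int:
        """Append an in-flight store and return its SQ index."""
        self.entries.append(StoreEntry(addr, value))
        return len(self.entries) - 1


def execute_load(addr: int, predicted_idx: Optional[int],
                 sq: StoreQueue, dcache: Dict[int, int]) -> Tuple[int, str]:
    """Make at most one indexed SQ access; otherwise read the data cache."""
    if predicted_idx is not None and 0 <= predicted_idx < len(sq.entries):
        entry = sq.entries[predicted_idx]
        if entry.addr == addr:  # address match: forward from the store
            return entry.value, "sq"
    # No prediction or address mismatch: take the value from the cache.
    # Missing addresses read as 0 in this toy model (an assumption).
    return dcache.get(addr, 0), "dcache"


def needs_flush(addr: int, speculative_value: int,
                sq: StoreQueue, dcache: Dict[int, int]) -> bool:
    """Oracle check (assumption): compare against the youngest matching store."""
    correct = dcache.get(addr, 0)
    for entry in reversed(sq.entries):
        if entry.addr == addr:
            correct = entry.value
            break
    return speculative_value != correct
```

In this model, a load whose predicted index points at a store to the same address forwards from the SQ, while a load whose prediction points at a store to a different address falls back to the cache and is later flagged by `needs_flush` if the cache value was stale.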
Year
2005
DOI
10.1109/MICRO.2005.29
Venue
MICRO
Keywords
dynamic load, store-load forwarding, scalable store-load forwarding, difficult load, sq index prediction, single sq entry, data cache, load address, sq entry, store queue index prediction, sq scalability, load scheduling, vliw, indexation
Field
Computer science, Dynamic load testing, Latency (engineering), Very long instruction word, Parallel computing, Queue, Real-time computing, Collision, Binary translation, Re-order buffer, Scalability
DocType
Conference
ISSN
1072-4451
ISBN
0-7695-2440-0
Citations
32
PageRank
1.06
References
17
Authors
3
Name
Order
Citations
PageRank
Tingting Sha 1833.20
Milo M. K. Martin 22677125.22
Amir Roth 375741.65