Title
Data prefetching and address pre-calculation through instruction pre-execution with two-step physical register deallocation
Abstract
This paper proposes an instruction pre-execution scheme that reduces latency and enables early scheduling of loads for a high-performance processor. Our scheme exploits the difference between the amount of instruction-level parallelism available with an unlimited number of physical registers and that available with the actual number of physical registers. We introduce a scheme called two-step physical register deallocation. As a first step, our scheme deallocates physical registers at the renaming stage, which eliminates pipeline stalls caused by a physical register shortage. As a second step, instructions wait in the instruction window for the final deallocation. While waiting, the scheme allows pre-execution of instructions. This enables prefetching of load data and early calculation of effective memory addresses. In particular, our execution-based scheme is strong at prefetching data with an irregular access pattern. Because an automatic prefetcher is strong on regular access patterns, combining it with our scheme makes the best use of our scheme. The evaluation results show that the combined scheme significantly improves performance over a processor with an automatic prefetcher.
Year
2007
DOI
10.1145/1327171.1327175
Venue
MEDEA '07 Proceedings of the 2007 workshop on MEmory performance: DEaling with Applications, systems and architecture
Keywords
actual number, physical register shortage, execution-based scheme, early scheduling, instruction pre-execution scheme, address pre-calculation, combined scheme, early calculation, automatic prefetcher, physical register, two-step physical register deallocation, vector processing, memory effect
DocType
Conference
Citations
7
PageRank
0.46
References
17
Authors
4
Name              Order  Citations  PageRank
Akihiro Yamamoto  1      135        26.84
Yusuke Tanaka     2      7          0.80
Hideki Ando       3      70         7.90
Toshio Shimada    4      30         2.10