Title
Multipath execution: opportunities and limits
Abstract
Even sophisticated branch-prediction techniques necessarily suf- fer some mispredictions, and even relatively small mispredict rates hurt performance substantially in current-generation processors. In this paper, we investigate schemes for improving performance in the face of imperfect branch predictors by having the processor si- multaneously execute code from both the taken and not-taken out- comes of a branch. This paper presents data regarding the limits of multipath ex- ecution, considers fetch-bandwidth needs for multipath execution, and discusses various dynamic confidence-prediction schemes that gauge the likelihood of branch mispredictions. Our evaluations consider executing along several (2-8) paths at once. Using 4 paths and a relatively simple confidence predictor, multipath execution garners speedups of up to 30% compared to the single-path case, with an average speedup of 14.4% for the SPECint suite. While as- sociated increases in instruction-fetch-bandwidth requirements are not too surprising, a less expected result is the significance of hav- ing a separate return-address stack for each forked path. Overall, our results indicate that multipath execution offers significant im- provements over single-path performance, and could be especially useful when combined with multithreading so that hardware costs can be amortized over both approaches. Modern processors employ a variety of sophisticated branch- prediction schemes to avoid delay penalties imposed by conditional- branch resolution. Correct predictions of branch outcomes, some- times accompanied by predictions of the target's address, can al- most eliminate these penalties. But mispredictions that remain still cause serious disruptions in instruction flow, as the processor wastes time following the wrong path until the misprediction is detected. As issue widths increase and processor pipelines deepen, the mis- prediction penalty increases. Many programs suffer a substantial number of mispredictions. Since the delays caused by conditional-branch mispredictions re- main a serious problem and more sophisticated prediction tech- niques still encounter information-theoretic limits (2), we inves- tigate a different kind of remedy: the simultaneous execution of both the taken and not-taken instruction sequences following a con- ditional branch, with cancelation of the one that turns out to be incorrect when the branch is finally resolved. Because additional
Year
DOI
Venue
1998
10.1145/277830.277854
International Conference on Supercomputing 2006
Keywords
Field
DocType
multipath execution,branch prediction
Multipath propagation,Computer architecture,Computer science,Parallel computing,Real-time computing
Conference
ISBN
Citations 
PageRank 
0-89791-998-X
25
1.59
References 
Authors
8
4
Name
Order
Citations
PageRank
Pritpal S. Ahuja114417.85
Kevin Skadron26188384.18
Margaret Martonosi38647715.76
Douglas W. Clark41009266.88