Abstract | ||
---|---|---|
As the disparity between processor and memory speeds continues to grow, memory latency is becoming an increasingly important performance bottleneck. While software-controlled prefetching is an attractive technique for tolerating this latency, its success has been limited thus far to array-based numeric codes. In this paper, we expand the scope of automatic compiler-inserted prefetching to also include the recursive data structures commonly found in pointer-based applications. We propose three compiler-based prefetching schemes, and automate the most widely applicable scheme (greedy prefetching) in an optimizing research compiler. Our experimental results demonstrate that compiler-inserted prefetching can offer significant performance gains on both uniprocessors and large-scale shared-memory multiprocessors. |
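
To give a rough sense of what greedy prefetching on a recursive data structure could look like, here is a minimal hand-written sketch in C. It assumes the usual description of the scheme: when a node is visited, prefetches are issued for every pointer stored in that node. The `Node` layout and the `tree_sum_greedy` function are hypothetical names chosen for illustration, and the GCC/Clang builtin `__builtin_prefetch` stands in for whatever prefetch instruction a compiler would actually insert; none of this is taken from the paper's implementation.

```c
#include <stddef.h>

/* Hypothetical binary tree node; the layout is an assumption for illustration. */
typedef struct Node {
    int value;
    struct Node *left;
    struct Node *right;
} Node;

/* Sum every value in the tree. On entry to each node, both child pointers
 * are prefetched so their cache lines may already be in flight by the time
 * the recursion reaches them. */
long tree_sum_greedy(const Node *n) {
    if (n == NULL) {
        return 0;
    }
    /* __builtin_prefetch is only a hint and does not fault on a NULL address. */
    __builtin_prefetch(n->left);
    __builtin_prefetch(n->right);
    return n->value + tree_sum_greedy(n->left) + tree_sum_greedy(n->right);
}
```

In this sketch the prefetch for the left child is of little use, since that child is visited immediately; any benefit would come mainly from the right child, whose fetch can overlap with the traversal of the left subtree.
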
Year | DOI | Venue |
---|---|---|
1999 | 10.1109/12.752654 | IEEE Trans. Computers |
Keywords | Field | DocType |
significant performance gain,software-controlled prefetching,memory speed,recursive data structures,cache storage,memory latency,important performance bottleneck,compiler-inserted prefetching,greedy prefetching,applicable scheme,data structures,uniprocessors,shared-memory multiprocessors,compiler-based prefetching scheme,shared memory systems,pointer-based applications,optimizing research compiler,automatic compiler-inserted prefetching,program compilers,tree data structures,recursive data structure,application software,computer science,tree graphs,compiler optimization | Pointer (computer programming),Bottleneck,Data structure,Latency (engineering),CPU cache,Computer science,Parallel computing,Real-time computing,Compiler,Optimizing compiler,CAS latency | Journal |
Volume | Issue | ISSN |
48 | 2 | 0018-9340 |
Citations | PageRank | References |
24 | 1.53 | 16 |
Authors | ||
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chi-Keung Luk | 1 | 2537 | 116.49 |
Todd C. Mowry | 2 | 3021 | 253.75 |