Title
Restructuring computations for temporal data cache locality
Abstract
Data access costs contribute significantly to the execution time of applications with complex data structures. As the latency of memory accesses becomes high relative to processor cycle times, application performance is increasingly limited by memory performance. In some situations it is useful to trade increased computation costs for reduced memory costs. The contributions of this paper are threefold: we provide a detailed analysis of the memory performance of seven memory-intensive benchmarks; we describe Computation Regrouping, a source-level approach to improving the performance of memory-bound applications by increasing temporal locality to eliminate cache and TLB misses; and we demonstrate significant performance improvement by applying Computation Regrouping to our suite of seven benchmarks. Using Computation Regrouping, we observe a geometric mean speedup of 1.90, with individual speedups ranging from 1.26 to 3.03. Most of this improvement comes from eliminating memory stall time.
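The idea of trading extra computation for better temporal locality can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the request format, and the lookup tables are hypothetical, and Python does not expose cache or TLB behavior, so the count of switches between tables is only a rough stand-in for locality.

```python
from collections import defaultdict


def run_in_arrival_order(requests, tables):
    """Baseline: handle each (table_id, key) request as it arrives."""
    results = [None] * len(requests)
    for i, (table_id, key) in enumerate(requests):
        results[i] = tables[table_id].get(key)
    return results


def run_regrouped(requests, tables):
    """Regrouped: defer requests, then process them one table at a time.

    The extra bookkeeping (the pending lists) is the added computation
    cost; the payoff is that all work on one table happens together.
    """
    pending = defaultdict(list)
    for i, (table_id, key) in enumerate(requests):
        pending[table_id].append((i, key))

    results = [None] * len(requests)
    for table_id, group in pending.items():
        table = tables[table_id]  # each table is fetched once per batch
        for i, key in group:
            results[i] = table.get(key)  # results keep their original slots
    return results


def count_table_switches(order):
    """Count how often consecutive accesses touch different tables."""
    return sum(1 for a, b in zip(order, order[1:]) if a != b)
```

Both functions return the same results in the same positions; only the order in which the tables are touched changes. For requests that alternate between two tables, the arrival order switches tables on every access, while the regrouped order switches once.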
Year: 2003
DOI: 10.1023/A:1024556711058
Venue: International Journal of Parallel Programming
Keywords: execution time, reduced memory cost, significant performance improvement, application performance, memory access, memory performance, data structures, computation regrouping, data access cost, temporal data, restructuring computation, cache locality, complex data structure, memory stall time, data structure, cycle time, optimization, geometric mean, data access, complex data
Field: Interleaved memory, Uniform memory access, Locality of reference, Cache, Computer science, Parallel computing, Distributed memory, Theoretical computer science, Non-uniform memory access, Translation lookaside buffer, Speedup
DocType: Journal
Volume: 31
Issue: 4
ISSN: 1573-7640
Citations: 9
PageRank: 0.63
References: 27
Authors: 4

Name                Order  Citations  PageRank
Venkata K. Pingali  1      129        12.68
Sally A. Mckee      2      1928       152.59
Wilson C. Hsieh     3      2532       261.94
John B. Carter      4      1785       162.82