Title
Data locality and load balancing in COOL
Abstract
Large-scale shared memory multiprocessors typically support a multilevel memory hierarchy consisting of per-processor caches, a local portion of shared memory, and remote shared memory. On such machines, the performance of parallel programs is often limited by the high latency of remote memory references. In this paper we explore how knowledge of the underlying memory hierarchy can be used to schedule computation and distribute data structures, and thereby improve data locality. Our study is done in the context of COOL, a concurrent object-oriented language developed at Stanford. We develop abstractions for the programmer to supply optional information about the data reference patterns of the program. This information is used by the runtime system to distribute tasks and objects so that the tasks execute close (in the memory hierarchy) to the objects they reference. We demonstrate the effectiveness of these techniques by applying them to several applications chosen from the SPLASH parallel benchmark suite. Our experience suggests that improving data locality can be simple through a combination of programmer abstractions and smart runtime scheduling.
Year: 1993
DOI: 10.1145/155332.155358
Venue: PPOPP
Keywords: logic programming, load balance, shared memory
Field: Interleaved memory, Uniform memory access, Programming language, Memory hierarchy, Shared memory, Computer science, Parallel computing, Distributed memory, Data diffusion machine, Memory map, Distributed shared memory, Distributed computing
DocType: Conference
Volume: 28
Issue: 7
ISSN: 0362-1340
ISBN: 0-89791-589-5
Citations: 37
PageRank: 3.32
References: 10
Authors: 3
Name
Order
Citations
PageRank
Rohit Chandra113214.47
Anoop Gupta26610867.50
John L. Hennessy33760911.05