Title
Garbage collection for multicore NUMA machines
Abstract
Modern high-end machines feature multiple processor packages, each of which contains multiple independent cores and integrated memory controllers connected directly to dedicated physical RAM. These packages are connected via a shared bus, creating a system with a heterogeneous memory hierarchy. Since this shared bus has less bandwidth than the sum of the links to memory, aggregate memory bandwidth is higher when parallel threads all access memory local to their processor package than when they access memory attached to a remote package. This bandwidth limitation has traditionally limited the scalability of modern functional language implementations, which seldom scale well past 8 cores, even on small benchmarks. This work presents a garbage collector integrated with our strict, parallel functional language implementation, Manticore, and shows that it scales effectively on both a 48-core AMD Opteron machine and a 32-core Intel Xeon machine.
Year
2011
DOI
10.1145/1988915.1988929
Venue
Proceedings of the 2011 ACM SIGPLAN Workshop on Memory Systems Performance and Correctness
Keywords
modern high-end machine, modern functional language implementation, 32-core intel xeon machine, heterogeneous memory hierarchy, shared bus, 48-core amd opteron machine, multicore numa machine, bandwidth limitation, access memory, integrated memory controller, garbage collection, aggregate memory bandwidth, numa
DocType
Conference
Volume
abs/1105.2554
Citations
9
PageRank
0.57
References
9
Authors
4
Name             Order  Citations  PageRank
Sven Auhagen     1      9          0.57
Lars Bergstrom   2      81         5.23
Matthew Fluet    3      296        20.32
John H. Reppy    4      899        84.36