Title
Efficient implementation of GPGPU synchronization primitives on CPUs
Abstract
The GPGPU model represents a style of execution in which thousands of threads execute in a data-parallel fashion, with a large subset (typically tens to hundreds) needing frequent synchronization. As the GPGPU model evolves to target both GPUs and CPUs as acceleration targets, thread synchronization becomes an important problem when running on CPUs. CPUs have little hardware support for this kind of synchronization, so it must be emulated in software, which reduces application performance. This paper presents software techniques to implement the GPGPU synchronization primitives on CPUs while maintaining application debuggability. Using limit studies on real hardware, we evaluate the potential performance benefits of an efficient barrier primitive.
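To make the idea of emulating a work-group barrier in software concrete, the following is a minimal sketch of a centralized sense-reversing barrier shared by a small group of CPU threads. It is an illustration only and is not taken from the paper: the class `SenseBarrier`, the function `run_work_group`, and the "write own slot, then read the neighbor's slot" workload are all hypothetical names and choices made for this example.

```python
import threading


class SenseBarrier:
    """Centralized sense-reversing barrier (illustrative, not the paper's method)."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.count = num_threads      # threads still to arrive in this phase
        self.sense = False            # flips each time the barrier opens
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            my_sense = not self.sense  # value the flag takes when this phase ends
            self.count -= 1
            if self.count == 0:
                # Last arrival: reset the counter and release the waiters.
                self.count = self.num_threads
                self.sense = my_sense
                self.cond.notify_all()
            else:
                # Earlier arrivals sleep until the sense flag flips.
                while self.sense != my_sense:
                    self.cond.wait()


def run_work_group(group_size=8, rounds=3):
    """Run one hypothetical work-group whose work-items sync on the barrier."""
    barrier = SenseBarrier(group_size)
    local = [0] * group_size                  # stands in for work-group local memory
    results = [[] for _ in range(group_size)]

    def work_item(lid):
        for r in range(rounds):
            local[lid] = r * 100 + lid        # write own slot
            barrier.wait()                    # every write is done past this point
            neighbor = (lid + 1) % group_size
            results[lid].append(local[neighbor])  # read the neighbor's slot
            barrier.wait()                    # every read is done before the next write

    threads = [threading.Thread(target=work_item, args=(i,)) for i in range(group_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Each work-item only ever reads a value that its neighbor wrote in the same round, which is the guarantee a work-group barrier is meant to provide. The sketch uses a lock and condition variable for simplicity; it says nothing about the performance techniques the paper evaluates.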
Year
2010
DOI
10.1145/1787275.1787295
Venue
Conf. Computing Frontiers
Keywords
gpgpu model evolves, real hardware, application debug-ability, efficient implementation, gpgpu model, hardware support, gpgpu synchronization primitive, potential performance benefit, frequent synchronization, application performance, thread synchronization, synchronization, gpgpu, multicore
Field
Synchronization, Computer science, Parallel computing, Real-time computing, Thread (computing), Software, General-purpose computing on graphics processing units, Acceleration, Synchronization (computer science), Multi-core processor
DocType
Conference
Citations
2
PageRank
0.40
References
4
Authors
5
Name
Order
Citations
PageRank
Jayanth Gummaraju132924.24
Ben Sander2472.62
Laurent Morichetti3472.62
Benedict Gaster441.11
Lee Howes51219.24