A Simple Cache Coherence Scheme for Integrated CPU-GPU Systems

Paper Info

Title
A Simple Cache Coherence Scheme for Integrated CPU-GPU Systems

Abstract
This paper presents a novel approach to accelerate applications running on integrated CPU-GPU systems. Many integrated CPU-GPU systems use cache-coherent shared memory to communicate. For example, after the CPU produces data for the GPU, the GPU may pull the data into its cache when it accesses the data. In such a pull-based approach, data resides in a shared cache until the GPU accesses it, resulting in long load latency on the first GPU access to a cache line. In this work, we propose a new push-based coherence mechanism that explicitly exploits the CPU-GPU producer-consumer relationship by automatically moving data from the CPU to the GPU last-level cache. The proposed mechanism results in a dramatic reduction of the GPU L2 cache miss rate in general, and a consequent increase in overall performance. Our experiments show that the proposed scheme can increase performance by up to 37%, with typical improvements in the 5-7% range. We find that even when tested applications do not benefit from the proposed approach, their performance does not decrease with our technique. While we demonstrate how the proposed scheme can co-exist with traditional cache coherence mechanisms, we argue that it could also be used as a simpler replacement for existing protocols.
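
The pull-versus-push contrast described in the abstract can be illustrated with a toy model. The sketch below is not the paper's protocol: it assumes a single GPU L2 modeled as an LRU set of cache lines, a CPU that produces a list of line addresses, and a GPU that then reads each line once. The names `GpuL2`, `run`, and the `policy` strings are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class GpuL2:
    """Toy GPU last-level cache with LRU replacement (hypothetical model)."""

    def __init__(self, capacity_lines):
        self.capacity = capacity_lines
        self.lines = OrderedDict()  # address -> True, ordered oldest to newest
        self.hits = 0
        self.misses = 0

    def install(self, addr):
        # Place a line in the cache and evict the least recently used one if full.
        self.lines[addr] = True
        self.lines.move_to_end(addr)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)

    def access(self, addr):
        # GPU load: a hit refreshes recency, a miss fetches and installs the line.
        if addr in self.lines:
            self.hits += 1
            self.lines.move_to_end(addr)
        else:
            self.misses += 1
            self.install(addr)


def run(policy, produced, capacity_lines=64):
    """Return the GPU L2 miss rate for one producer-consumer round."""
    l2 = GpuL2(capacity_lines)
    for addr in produced:  # CPU producer phase
        if policy == "push":
            l2.install(addr)  # assumed: the write is forwarded to the GPU L2
        # under "pull" the line stays in the shared cache until the GPU asks
    for addr in produced:  # GPU consumer phase
        l2.access(addr)
    return l2.misses / len(produced)


lines = list(range(32))
pull_rate = run("pull", lines)
push_rate = run("push", lines)
print(f"pull miss rate: {pull_rate:.2f}, push miss rate: {push_rate:.2f}")
```

In this simplified setting every first GPU access misses under the pull policy, while the push policy removes those misses as long as the produced data fits in the modeled cache. Real hardware behavior depends on details the abstract does not give, such as timing, bandwidth, and interaction with the existing coherence protocol.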

Year	2020
DOI	10.1109/DAC18072.2020.9218664
Venue	Proceedings of the 2020 57th ACM/EDAC/IEEE Design Automation Conference (DAC)
Keywords	Cache coherence, GPU, Integrated CPU/GPU
DocType	Conference
ISSN	0738-100X
Citations	0
PageRank	0.34
References	0
Authors	4

Authors (4 rows)

Name	Order	Citations	PageRank
Ardhi Wiratama Baskara Yudha	1	1	1.71
Reza Pulungan	2	79	8.84
Henry Hoffmann	3	1772	95.10
Yan Solihin	4	2057	111.56
