Title
An Architecture for Integrated Near-Data Processors
Abstract
To increase the performance of data-intensive applications, we present an extension to a CPU architecture that enables arbitrary near-data processing capabilities close to the main memory. This is realized by introducing a component attached to the CPU system bus and a component at the memory side. Together they provide hardware-managed coherence and virtual memory support to integrate the near-data processors into a shared-memory environment. We present an implementation of the components, as well as a system simulator that provides detailed performance estimates. With a variety of synthetic workloads, we demonstrate the performance of the memory accesses, the mixed fine- and coarse-grained coherence mechanisms, and the near-data processor communication mechanism. Furthermore, we quantify the inevitable start-up penalty incurred by coherence and data writeback, and argue that near-data processing workloads should access data several times to offset this penalty. A case study based on the Graph500 benchmark confirms the small overhead of the proposed coherence mechanisms and shows the ability to outperform a real CPU by a factor of two.
Year
2017
DOI
10.1145/3127069
Venue
TACO
Keywords
Computer architecture, coherence, data locality, graph500, near-data processing, virtual memory
Field
Computer architecture, Central processing unit, Uniform memory access, Computer science, Virtual memory, Parallel computing, Cache-only memory architecture, Coherence (physics), Real-time computing, Memory coherence, Non-uniform memory access, Graph500
DocType
Journal
Volume
14
Issue
3
ISSN
1544-3566
Citations
1
PageRank
0.35
References
34
Authors
6
Name                  Order  Citations  PageRank
Erik Vermij           1      31         4.63
Leandro Fiorin        2      229        17.10
Rik Jongerius         3      45         6.49
Christoph Hagleitner  4      108        20.84
Jan van Lunteren      5      297        20.56
Koen Bertels          6      1365       138.66