Abstract
---
In-memory computing is emerging as a promising paradigm in commodity servers to accelerate data-intensive processing by striving to keep the entire dataset in DRAM. To address the tremendous pressure on the main memory system, discrete memory modules can be networked together to form a memory pool, enabled by recent trends towards richer memory interfaces (e.g., Hybrid Memory Cubes, or HMCs). Such an inter-memory network provides a scalable fabric to expand memory capacity, but it still suffers from long multi-hop latency, limited bandwidth, and high power consumption---problems that will only worsen as the gap between interconnect and transistor performance grows. Moreover, inside each memory module, an intra-memory network (NoC) is typically employed to connect different memory partitions. Without careful design, back-pressure inside the memory modules can propagate to the inter-memory network and create a performance bottleneck.
To address these problems, we propose co-optimization of the intra- and inter-memory networks. First, we re-organize the intra-memory network structure and provide a smart I/O interface that reuses the intra-memory NoC as the network switches for inter-memory communication, thus forming a unified memory network. Based on this architecture, we further optimize the inter-memory network for both high performance and low energy, including a distance-aware selective compression scheme to drastically reduce the communication burden, and a light-weight power-gating algorithm to turn off under-utilized links while guaranteeing a connected graph and deadlock-free routing. We develop an event-driven simulator to model our proposed architectures. Experimental results based on both synthetic traffic and real big-data workloads show that our unified memory network architecture achieves a 75.1% average memory access latency reduction and a 22.1% total memory energy saving.
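The intuition behind distance-aware selective compression can be sketched as follows: compression only pays off when the packet traverses enough hops that the saved serialization time exceeds the (de)compression latency. The sketch below is a conceptual illustration, not the paper's implementation; all function names, cycle costs, and the decision rule are illustrative assumptions.

```python
# Conceptual sketch (not the authors' implementation): distance-aware
# selective compression compresses a memory packet only when the saved
# per-hop serialization time outweighs the compression/decompression cost.

def should_compress(hops, flits_uncompressed, flits_compressed,
                    per_hop_cycles=2, codec_cycles=5):
    """Return True if compressing the packet reduces end-to-end latency.

    All parameters besides `hops` are assumed values for illustration:
    - per_hop_cycles: serialization cost per flit per link traversal
    - codec_cycles: combined compression + decompression latency
    """
    # Cycles saved grows with both the size reduction and the distance.
    saved = (flits_uncompressed - flits_compressed) * hops * per_hop_cycles
    return saved > codec_cycles

# A nearby packet is sent uncompressed; a distant one is compressed.
print(should_compress(hops=1, flits_uncompressed=5, flits_compressed=3))  # False
print(should_compress(hops=4, flits_uncompressed=5, flits_compressed=3))  # True
```

The key design point is that the decision depends on the hop distance, so short-range traffic avoids the fixed codec latency while long-range traffic amortizes it over many links.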
Year | DOI | Venue
---|---|---
2016 | 10.5555/3195638.3195673 | MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture, Taipei, Taiwan, October 2016
Keywords | Field | DocType
---|---|---
unified memory network architecture, in-memory computing, commodity servers, data-intensive processing, DRAM, main memory system, discrete memory modules, memory pool, hybrid memory cubes, HMC, inter-memory network, multi-hop latency, power consumption, intra-memory network, NoC, memory network structure, network switches, distance-aware selective compression, light-weight power-gating algorithm, connected graph, deadlock-free routing, event-driven simulator, big-data workloads, average memory access latency reduction, total memory energy saving | Registered memory, Interleaved memory, Extended memory, Uniform memory access, Physical address, Computer science, Parallel computing, Computing with Memory, Real-time computing, Memory management, Flat memory model | Conference
ISSN | ISBN | Citations
---|---|---
1072-4451 | 978-1-4503-4952-9 | 1
PageRank | References | Authors
---|---|---
0.36 | 32 | 7
Name | Order | Citations | PageRank |
---|---|---|---|
Jia Zhan | 1 | 87 | 5.45 |
Itir Akgun | 2 | 24 | 4.13 |
Jishen Zhao | 3 | 638 | 38.51 |
Al Davis | 4 | 986 | 54.47 |
Paolo Faraboschi | 5 | 974 | 81.37 |
Yuangang Wang | 6 | 1 | 0.36 |
Yuan Xie | 7 | 6430 | 407.00 |