Abstract
---
In-memory computing is emerging as a promising paradigm in commodity servers to accelerate data-intensive processing by striving to keep the entire dataset in DRAM. To address the tremendous pressure on the main memory system, discrete memory modules can be networked together to form a memory pool, enabled by recent trends towards richer memory interfaces (e.g., Hybrid Memory Cubes, or HMCs). Such an inter-memory network provides a scalable fabric to expand memory capacity, but it still suffers from long multi-hop latency, limited bandwidth, and high power consumption---problems that will only worsen as the gap between interconnect and transistor performance grows. Moreover, inside each memory module, an intra-memory network (NoC) is typically employed to connect different memory partitions. Without careful design, back-pressure inside the memory modules can propagate to the inter-memory network and create a performance bottleneck.
To address these problems, we propose co-optimization of the intra- and inter-memory networks. First, we re-organize the intra-memory network structure and provide a smart I/O interface that reuses the intra-memory NoC as the network switches for inter-memory communication, thus forming a unified memory network. Based on this architecture, we further optimize the inter-memory network for both high performance and low energy, including a distance-aware selective compression scheme to drastically reduce the communication burden, and a light-weight power-gating algorithm to turn off under-utilized links while guaranteeing a connected graph and deadlock-free routing. We develop an event-driven simulator to model our proposed architectures. Experimental results based on both synthetic traffic and real big-data workloads show that our unified memory network architecture achieves a 75.1% average memory access latency reduction and a 22.1% total memory energy saving.
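The intuition behind distance-aware selective compression can be sketched as follows: compression only pays off when the packet traverses enough hops that the saved serialization time exceeds the (de)compression latency. The sketch below is a conceptual illustration, not the paper's implementation; all function names, cycle costs, and the decision rule are illustrative assumptions.

```python
# Conceptual sketch (not the authors' implementation): distance-aware
# selective compression compresses a memory packet only when the saved
# per-hop serialization time outweighs the compression/decompression cost.

def should_compress(hops, flits_uncompressed, flits_compressed,
                    per_hop_cycles=2, codec_cycles=5):
    """Return True if compressing the packet reduces end-to-end latency.

    All parameters besides `hops` are assumed values for illustration:
    - per_hop_cycles: serialization cost per flit per link traversal
    - codec_cycles: combined compression + decompression latency
    """
    # Cycles saved grows with both the size reduction and the distance.
    saved = (flits_uncompressed - flits_compressed) * hops * per_hop_cycles
    return saved > codec_cycles

# A nearby packet is sent uncompressed; a distant one is compressed.
print(should_compress(hops=1, flits_uncompressed=5, flits_compressed=3))  # False
print(should_compress(hops=4, flits_uncompressed=5, flits_compressed=3))  # True
```

The key design point is that the decision depends on the hop distance, so short-range traffic avoids the fixed codec latency while long-range traffic amortizes it over many links.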
Year | DOI | Venue
---|---|---
2016 | 10.5555/3195638.3195673 | MICRO-49: The 49th Annual IEEE/ACM International Symposium on Microarchitecture, Taipei, Taiwan, October 2016
Keywords | Field | DocType
---|---|---
unified memory network architecture, in-memory computing, commodity servers, data-intensive processing, DRAM, main memory system, discrete memory modules, memory pool, hybrid memory cubes, HMC, inter-memory network, multi-hop latency, power consumption, intra-memory network, NoC, memory network structure, network switches, distance-aware selective compression, light-weight power-gating algorithm, connected graph, deadlock-free routing, event-driven simulator, big-data workloads, average memory access latency reduction, total memory energy saving | Registered memory, Interleaved memory, Extended memory, Uniform memory access, Physical address, Computer science, Parallel computing, Computing with Memory, Real-time computing, Memory management, Flat memory model | Conference
ISSN | ISBN | Citations
---|---|---
1072-4451 | 978-1-4503-4952-9 | 1
PageRank | References | Authors
---|---|---
0.36 | 32 | 7
Name | Order | Citations | PageRank |
---|---|---|---|
Jia Zhan | 1 | 87 | 5.45 |
Itir Akgun | 2 | 24 | 4.13 |
Jishen Zhao | 3 | 638 | 38.51 |
Al Davis | 4 | 986 | 54.47 |
Paolo Faraboschi | 5 | 974 | 81.37 |
Yuangang Wang | 6 | 1 | 0.36 |
Yuan Xie | 7 | 6430 | 407.00 |