Title
Extending the Performance and Energy-Efficiency of Shared Memory Multicores with Nanophotonic Technology
Abstract
As the number of cores on a single chip increases exponentially, the design and integration of both the on-chip network that facilitates intercore communication and the cache coherence protocol that enables shared memory programming have become critical for improved energy efficiency and overall chip performance. With traditional metal interconnects facing stringent energy constraints, researchers are pursuing disruptive solutions such as nanophotonics for improved energy efficiency. Cache coherence in multicores can be enforced effectively by snoopy protocols; however, broadcasting every cache miss can limit scalability while consuming excess energy. In this paper, we propose PULSE, a nanophotonic broadcast tree-based network for snoopy cache coherent multicores. To limit the energy penalty from broadcasting (and thereby splitting) optical signals, we direct the optical signal from the external laser such that only the subset of requesters can receive the optical signal. Furthermore, as cache blocks are shared by only a few cores, we propose a multicast version of PULSE called multi-PULSE that predicts the sharers for each L2 miss and morphs the broadcast network into a multicast network. We evaluate the energy and performance using CACTI and SIMICS on 16-core and 64-core versions of PULSE and multi-PULSE for Splash-2, PARSEC, and SPEC CPU2006 benchmarks and compare them to electrical networks, optical networks, and other cache filtering techniques. Our results indicate that PULSE outperforms competitive electrical/optical networks by 60 percent in terms of execution time, and multi-PULSE reduces average energy by 10 to 80 percent even with a few mispredictions.
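The energy argument for multicast can be illustrated with a back-of-the-envelope calculation. The sketch below is not the paper's energy model: it assumes an idealized, lossless, equal power split, ignores waveguide, coupler, and misprediction costs, and uses a hypothetical function name and an illustrative sharer count of 4. Under those assumptions, dividing one optical signal among N receivers costs 10·log10(N) dB, so reaching a few predicted sharers needs far less laser power than reaching all 64 cores.

```python
import math


def splitting_loss_db(receivers: int) -> float:
    """Ideal loss in dB when one optical signal is split equally among `receivers`.

    Hypothetical helper for illustration only; real devices add excess loss.
    """
    if receivers < 1:
        raise ValueError("receivers must be >= 1")
    return 10.0 * math.log10(receivers)


# Broadcast to every core of a 64-core chip versus multicast to an
# assumed (illustrative) set of 4 predicted sharers.
broadcast_db = splitting_loss_db(64)
multicast_db = splitting_loss_db(4)

# Difference in dB, and the same difference as a linear power ratio.
saving_db = broadcast_db - multicast_db
power_ratio = 10.0 ** (saving_db / 10.0)

print(f"broadcast split loss: {broadcast_db:.2f} dB")
print(f"multicast split loss: {multicast_db:.2f} dB")
print(f"laser power ratio (broadcast / multicast): {power_ratio:.1f}x")
```

In this idealized setting the ratio is simply 64 / 4; any measured saving depends on device losses and on how often the sharer prediction is wrong.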
Year: 2014
DOI: 10.1109/TPDS.2013.26
Venue: IEEE Trans. Parallel Distrib. Syst.
Keywords: energy conservation, nanophotonics, shared memory systems, trees (mathematics), CACTI, PARSEC benchmark, PULSE, SIMICS, SPEC CPU2006 benchmarks, Splash-2 benchmark, cache coherence protocol, improved energy-efficiency, multi-PULSE, multicast network, nanophotonic broadcast tree-based network, nanophotonic technology, on-chip network, overall chip performance, shared memory multicores, shared memory programming, snoopy cache coherent multicores, snoopy protocols, stringent energy constraints, Network-on-chips, broadcast, cache coherence, multicast
Field: Cache invalidation, Cache pollution, Cache, Snoopy cache, Computer science, Parallel computing, MESI protocol, Cache algorithms, Real-time computing, Cache coloring, Smart Cache, Distributed computing
DocType: Journal
Volume: 25
Issue: 1
ISSN: 1045-9219
Citations: 17
PageRank: 0.64
References: 16
Authors: 3
Name                  Order  Citations  PageRank
Randy Morris          1      34         3.72
Evan Jolley           2      17         0.98
Avinash Karanth Kodi  3      270        29.29