Title
A GALS many-core heterogeneous DSP platform with source-synchronous on-chip interconnection network
Abstract
This paper presents a many-core heterogeneous computational platform that employs a GALS-compatible circuit-switched on-chip network. The platform targets streaming DSP and embedded applications that have a high degree of task-level parallelism among computational kernels. The test chip, fabricated in 65 nm CMOS, consists of 164 small, simple programmable cores, three dedicated-purpose accelerators, and three shared memory modules. All processors are clocked by their own local oscillators, and communication is achieved through a simple yet effective source-synchronous technique that allows each interconnection link between any two processors to sustain a peak throughput of one data word per cycle. A complete 802.11a WLAN baseband receiver was implemented on this platform. It has a real-time throughput of 54 Mbps with all processors running at 594 MHz and 0.95 V, and consumes an average of 174.76 mW, of which 12.18 mW (7.0%) is dissipated by its interconnection links. To fully exploit the GALS architecture, each processor's oscillator can be adjusted to run at a workload-based optimal clock frequency; with the chip's dual supply voltages set at 0.95 V and 0.75 V, the receiver consumes only 123.18 mW, a 29.5% reduction in power. Power consumption measured on the real chip differs from the estimated results by only 2-5%, showing the design to be highly reliable and efficient.
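The two percentages quoted in the abstract follow directly from its power figures. The short Python sketch below recomputes them as a consistency check; the variable and function names are illustrative only and do not come from the paper.

```python
# Power figures quoted in the abstract (milliwatts).
TOTAL_MW = 174.76   # all processors at 594 MHz and 0.95 V
LINKS_MW = 12.18    # portion dissipated by the interconnection links
TUNED_MW = 123.18   # per-processor clock tuning, 0.95 V / 0.75 V supplies


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the total power spent in the interconnection links.
link_share = percent(LINKS_MW, TOTAL_MW)

# Relative saving from workload-based frequency and voltage selection.
power_reduction = percent(TOTAL_MW - TUNED_MW, TOTAL_MW)

print(f"link share:      {link_share:.1f}%")       # 7.0%
print(f"power reduction: {power_reduction:.1f}%")  # 29.5%
```

Both values round to the figures stated in the abstract (7.0% and 29.5%).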
Year: 2009
DOI: 10.1109/NOCS.2009.5071470
Venue: NOCS
Keywords: interconnection link,many-core heterogeneous computational platform,test chip,effective source-synchronous communication technique,real chip,computational kernel,peak throughput,wlan baseband receiver,gals compatible circuit-switched on-chip,gals architecture,source-synchronous on-chip interconnection network,many-core heterogeneous dsp platform,parallel processing,network on a chip,concurrent computing,cmos,real time,processors,circuit switched,local oscillator,computer networks,shared memory,chip,digital signal processing,synchronization,throughput,cmos integrated circuits
Field: Baseband,Shared memory,Computer science,Real-time computing,CMOS,Chip,Throughput,Source-synchronous,Interconnection,Clock rate,Embedded system
DocType: Conference
Citations: 4
PageRank: 0.40
References: 22
Authors: 3
Name            Order  Citations  PageRank
Anh T. Tran     1      20         1.49
Dean N. Truong  2      4          0.74
Bevan M. Baas   3      295        27.78