Title |
---|
DLWAP-buffer: A Novel HW/SW Architecture to Alleviate the Cache Coherence on Streaming-like Data in CMP |
Abstract |
---|
In a shared-memory chip multiprocessor (CMP), data shared between cores must be exchanged through the shared last-level cache while cache coherence is maintained. As the number of cores increases, the cache coherence wall becomes more and more serious. For multimedia applications dominated by streaming-like data, existing multicore cache coherence protocols perform poorly and cannot meet timeliness requirements. In this paper, considering the poor temporal locality and strict real-time requirements of multimedia data, we propose the distributed light-weight active-push buffer (DLWAP-buffer) architecture to alleviate cache coherence latency on streaming-like data in CMPs. The architecture introduces a dedicated shared-data exchange channel between adjacent cores. The channel bridges the cores' internal register files and reduces shared-data communication latency. Supported by its control protocol, the architecture can adaptively balance the rate mismatch in the producer-consumer pipeline model. We build a quad-core CMP simulation platform with DLWAP-buffers. Our experiments indicate that, compared with the last-level-shared-cache method, the architecture improves average performance by 13% and reduces the snooping operations caused by maintaining cache coherence by 26%. |
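
The abstract describes a buffer between adjacent cores that absorbs the rate mismatch in a producer-consumer pipeline. The following is a minimal software sketch of that general idea only, assuming a bounded FIFO with stall-on-full and stall-on-empty behavior. It is not the paper's hardware design or control protocol, and the names `PushBuffer` and `run_pipeline`, along with all parameters, are hypothetical.

```python
from collections import deque


class PushBuffer:
    """Hypothetical bounded FIFO standing in for a core-to-core push buffer."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._slots = deque()

    def try_push(self, item):
        """Producer side: return False (a stall) when the buffer is full."""
        if len(self._slots) >= self.capacity:
            return False
        self._slots.append(item)
        return True

    def try_pop(self):
        """Consumer side: return None (a stall) when the buffer is empty."""
        if not self._slots:
            return None
        return self._slots.popleft()


def run_pipeline(n_items, capacity, produce_per_tick, consume_per_tick):
    """Simulate a two-stage pipeline with mismatched rates.

    Returns (received_items, producer_stalls, consumer_stalls).
    All rates are assumptions chosen by the caller, not measured values.
    """
    if produce_per_tick < 1 or consume_per_tick < 1:
        raise ValueError("rates must be at least 1 item per tick")
    buf = PushBuffer(capacity)
    received = []
    next_item = 0
    producer_stalls = 0
    consumer_stalls = 0
    while len(received) < n_items:
        # Producer stage: push until its per-tick budget or the buffer runs out.
        for _ in range(produce_per_tick):
            if next_item >= n_items:
                break
            if buf.try_push(next_item):
                next_item += 1
            else:
                producer_stalls += 1
                break
        # Consumer stage: pop until its per-tick budget or the buffer runs out.
        for _ in range(consume_per_tick):
            item = buf.try_pop()
            if item is None:
                consumer_stalls += 1
                break
            received.append(item)
    return received, producer_stalls, consumer_stalls
```

Calling `run_pipeline(8, 2, 3, 1)` models a fast producer feeding a slow consumer: the items still arrive in order, and the producer stalls whenever the two-slot buffer is full. Swapping the rates makes the consumer stall instead.
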
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/MCSoC.2012.19 | MCSoC |
Keywords | Field | DocType |
---|---|---|
cmp,average performance,multicore cache coherence protocol,shared-memory chip multiprocessor,cores increase,multicore cache coherence protocols,dlwap-buffer,cache storage,sw architecture,hw-sw architecture,cache coherence,multimedia applications,shared data,cache coherence latency,streaming-like data,last-level-shared-cache method,shared-data communication latency reduction,shared memory systems,internal register files,producer-consumer pipeline model,quad-core cmp simulation platform,media streaming,multicore,memory architecture,cache coherence wall,adaptive,multimedia data,dedicated shared-data exchange channel,distributed light-weight active-push buffer architecture,novel hw,computer architecture,pipelines,registers,coherence,instruction sets,protocols | Computer architecture,Cache invalidation,Cache pollution,Cache,Computer science,MESIF protocol,MESI protocol,Cache algorithms,Bus sniffing,Smart Cache | Conference |
ISBN | Citations | PageRank |
---|---|---|
978-0-7695-4800-5 | 0 | 0.34 |
References | Authors |
---|---|
7 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xiaoping Huang | 1 | 0 | 1.01 |
Fan Xiaoya | 2 | 26 | 13.12 |
Shengbing Zhang | 3 | 6 | 4.89 |
Yuhui Chen | 4 | 13 | 4.26 |