Abstract | ||
---|---|---|
While Coarse-Grained Reconfigurable Architectures (CGRAs) are very efficient at handling regular, compute-intensive loops, their weakness at control-intensive processing and their need for frequent reconfiguration require a companion processor, a role usually filled by the main processor. To minimize the overhead arising in such collaborative execution, we integrate a dedicated sequential processor (SP) with a reconfigurable array (RA), where the crucial problem is how to share memory between the SP and the RA while keeping the SP's memory access latency very short. We present the detailed architecture, control, and a program example of our approach, focusing on our optimized on-chip shared memory organization between the SP and the RA. Our preliminary results demonstrate that our optimized memory architecture is very effective in reducing kernel execution times (by 23.5% compared to a more straightforward alternative), and that our approach significantly reduces the RA control overhead and other sequential code execution time in kernels, resulting in up to a 23.1% reduction in kernel execution time compared to a conventional system that uses the main processor for sequential code execution. |
Year | DOI | Venue |
---|---|---|
2013 | 10.7873/DATE.2013.320 | DATE |
Keywords | DocType | ISSN |
embedded systems,decoding,multiprocessor,irrigation | Conference | 1530-1591 |
Citations | PageRank | References |
7 | 0.50 | 14 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jongeun Lee | 1 | 429 | 33.71 |
Yeonghun Jeong | 2 | 9 | 0.87 |
Sungsok Seo | 3 | 7 | 0.50 |