Title
Exploiting algorithmic-level memory parallelism in distributed logic-memory architecture through hardware-assisted dynamic graph (abstract only)
Abstract
Emerging FPGA devices, integrated with abundant RAM blocks and high-performance processor cores, offer an unprecedented opportunity to effectively implement single-chip distributed logic-memory (DLM) architectures. Being "memory-centric", the DLM architecture can significantly improve the overall performance and energy efficiency of many memory-intensive embedded applications, especially those that exhibit irregular array data access patterns at the algorithmic level. However, implementing a DLM architecture poses unique challenges to an FPGA designer in terms of 1) organizing and partitioning diverse on-chip memory resources, and 2) orchestrating effective data transmission between on-chip and off-chip memory. In this paper, we offer our solutions to both of these challenges. Specifically, 1) we propose a stochastic memory partitioning scheme based on the well-known simulated annealing algorithm, which explores a large solution space to obtain memory partitioning solutions that promote parallelized memory accesses; 2) we augment the proposed DLM architecture with a reconfigurable hardware graph that can dynamically compute precedence relationships between memory partitions, thus effectively exploiting algorithmic-level memory parallelism on a per-application basis. We evaluate the effectiveness of our approach (A3) against two other DLM architecture synthesizing methods: an algorithm-centric reconfigurable computing architecture with a single monolithic memory (A1) and the heterogeneous distributed architecture synthesized according to (A2). All experiments have been conducted with a Virtex-5 (XCV5LX155T-2) FPGA. On average, our experimental results show that our proposed A3 architecture outperforms A2 and A1 by 34% and 250%, respectively. Of the performance improvement of A3 over A2, more than 70% comes from the hardware graph-based memory scheduling.
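As a rough illustration of the first contribution, the sketch below applies simulated annealing to a toy memory partitioning problem. The abstract does not specify the cost function, the move set, or the cooling schedule, so everything here is an assumption made for illustration: the names `conflict_cost` and `anneal_partition`, the cost (pairs of arrays that are accessed together but placed in the same bank), the single-array reassignment move, and the geometric cooling schedule are all hypothetical and are not taken from the paper.

```python
import math
import random


def conflict_cost(assignment, access_groups):
    """Count pairs of arrays that are accessed together but share a bank.

    Hypothetical cost function: the paper's actual objective is not given.
    """
    cost = 0
    for group in access_groups:
        for i in range(len(group)):
            for j in range(i + 1, len(group)):
                if assignment[group[i]] == assignment[group[j]]:
                    cost += 1
    return cost


def anneal_partition(arrays, access_groups, num_banks, steps=5000,
                     t_start=2.0, t_end=0.01, seed=0):
    """Assign each array to a memory bank using simulated annealing."""
    rng = random.Random(seed)
    current = {a: rng.randrange(num_banks) for a in arrays}
    current_cost = conflict_cost(current, access_groups)
    best, best_cost = dict(current), current_cost

    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(1, steps - 1))

        # Move: reassign one randomly chosen array to a random bank.
        array = rng.choice(arrays)
        old_bank = current[array]
        current[array] = rng.randrange(num_banks)
        new_cost = conflict_cost(current, access_groups)
        delta = new_cost - current_cost

        # Metropolis rule: always accept improvements, and accept a worse
        # solution with probability exp(-delta / temp).
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost = new_cost
            if current_cost < best_cost:
                best, best_cost = dict(current), current_cost
        else:
            current[array] = old_bank  # reject the move

    return best, best_cost


# Toy workload: four arrays, each group is accessed in the same cycle.
arrays = ["A", "B", "C", "D"]
access_groups = [["A", "B"], ["B", "C"], ["C", "D"], ["A", "D"]]
best, best_cost = anneal_partition(arrays, access_groups, num_banks=2)
print(best_cost, best)
```

In this toy workload a conflict-free placement exists (A and C in one bank, B and D in the other), so the search can drive the cost to zero. A realistic formulation would also have to account for bank capacities, port counts, and the mix of on-chip memory types that the abstract mentions.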
Year
2013
DOI
10.1145/2435264.2435333
Venue
FPGA
Keywords
parallelized memory access,off-chip memory,hardware graph-based memory scheduling,logic-memory architecture,hardware-assisted dynamic graph,algorithmic level memory parallelism,single monolithic memory,a3 architecture,dlm architecture,algorithmic-level memory parallelism,diverse on-chip memory resource,stochastic memory,memory partition,fpga,memory
Field
Interleaved memory,Uniform memory access,Shared memory,Computer science,Parallel computing,Cache-only memory architecture,Distributed memory,Computing with Memory,Real-time computing,Distributed shared memory,Computer hardware,Memory architecture
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
4
Name | Order | Citations | PageRank
Yu Bai | 1 | 14 | 8.86
Abigail Fuentes | 2 | 1 | 0.72
Mingjie Lin | 3 | 73 | 25.04
Mike Riera | 4 | 0 | 0.34