Title
Maximizing CNN Throughput on FPGA Clusters
Abstract
Field Programmable Gate Arrays (FPGAs) have been a popular platform for deploying Convolutional Neural Networks (CNNs) because of their high parallelism and low energy consumption. Because the on-chip resources of a single board are limited, FPGA clusters are a promising way to improve CNN throughput. In this paper, we first put forward strategies to optimize resource allocation within and across FPGA boards. We then model the multi-board cluster problem and design algorithms based on the knapsack problem and dynamic programming to calculate the optimal topology of the FPGA cluster. We also give a quantitative analysis of the inter-board data transmission bandwidth requirement. To make our design applicable to more situations, we provide solutions for deploying fully connected layers and special convolution layers with large memory requirements. Experimental results show that typical well-known CNNs deployed with the proposed FPGA cluster topology can obtain higher throughput per board than single-board solutions and other multi-board solutions.
Year: 2020
DOI: 10.1145/3373087.3375366
Venue: FPGA
Field: Cluster (physics), Computer science, Parallel computing, Field-programmable gate array, Throughput
DocType: Conference
ISBN: 978-1-4503-7099-8
Citations: 0
PageRank: 0.34
References: 0
Authors: 6
Name | Order | Citations/PageRank
Ruihao Li | 1 | 409.88
Ke Liu | 2 | 2016.97
Mengying Zhao | 3 | 818.31
Zhaoyan Shen | 4 | 147.94
Xiaojun Cai | 5 | 64.17
Zhiping Jia | 6 | 46360.64