
Paper Info

Title
Exploration of Low Numeric Precision Deep Learning Inference Using Intel® FPGAs: (Abstract Only).

Abstract
Convolutional neural networks have been shown to maintain reasonable classification accuracy when quantized down to 8 bits; however, quantizing activations and weights to sub-8-bit precision can cause classification accuracy to fall below an acceptable threshold. Techniques exist for increasing the accuracy of sub-8-bit networks, typically by increasing computation, which results in a trade-off between throughput and accuracy; these techniques can be tailored to different networks through combinations of activation and weight precisions. Customizable hardware architectures like FPGAs provide an opportunity for data-width-specific computation through unique logic configurations, leading to highly optimized processing that is unattainable by full-precision networks. Specifically, ternary and binary weighted networks offer an efficient method of inference for 2-bit and 1-bit data, respectively. In this paper, we present a hardware design for FPGAs that takes advantage of the bandwidth, memory, and computation savings of limited numerical precision data. We provide insights into the trade-offs between throughput and accuracy for various networks and how they map to our framework. Further, we show how limited numeric precision computation can be efficiently mapped onto FPGAs for both ternary and binary cases. Starting with Arria 10, we show a 2-bit activation and ternary weighted AlexNet running in hardware that achieves 3,700 images per second on the ImageNet dataset with a top-1 accuracy of 0.49. Using a hardware modeler designed for our low numeric precision framework, we project performance, most notably for a 55.5 TOPS Stratix 10 device running a modified ResNet-34 with only 3.7% accuracy degradation compared with single precision.
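
The abstract's point that ternary weights make inference cheap can be illustrated with a minimal sketch. It is not the paper's hardware design: the fixed threshold, the 2-bit activation range, and the function names `ternarize`, `quantize_activation_2bit`, and `ternary_dot` are all assumptions made for illustration. The sketch only shows that once weights are restricted to -1, 0, and +1, a dot product needs additions and subtractions rather than multiplications.

```python
def ternarize(weights, threshold):
    """Map each real weight to -1, 0, or +1 using a fixed magnitude threshold.

    The fixed threshold is an illustrative assumption, not the paper's method.
    """
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]


def quantize_activation_2bit(x, max_value):
    """Clip x to [0, max_value] and round to one of four levels (0..3)."""
    clipped = min(max(x, 0.0), max_value)
    return round(clipped / max_value * 3)


def ternary_dot(activations, ternary_weights):
    """Dot product with weights in {-1, 0, +1}: adds and subtracts only."""
    acc = 0
    for a, w in zip(activations, ternary_weights):
        if w == 1:
            acc += a
        elif w == -1:
            acc -= a
        # w == 0 contributes nothing, so the term is skipped
    return acc


# Hypothetical example values, chosen only to exercise the functions.
weights = [0.8, -0.05, -0.6, 0.02]
activations = [0.9, 0.4, 0.1, 1.5]

t_weights = ternarize(weights, threshold=0.1)
q_acts = [quantize_activation_2bit(a, max_value=1.0) for a in activations]
result = ternary_dot(q_acts, t_weights)
print(t_weights, q_acts, result)
```

In hardware, the same idea lets each weight be stored in 2 bits and each multiply-accumulate unit be replaced by an adder/subtractor with a skip for zero weights, which is the kind of data-width-specific logic the abstract attributes to FPGAs.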

Year	Venue	Field
2018	FPGA	Single-precision floating-point format, Stratix, Computer science, Convolutional neural network, Parallel computing, Field-programmable gate array, Bandwidth (signal processing), Artificial intelligence, Throughput, Deep learning, Computer engineering, Computation
DocType	ISBN	Citations
Conference	978-1-4503-5614-5	0
PageRank	References	Authors
0.34	0	6

Authors (6 rows)

Name	Order	Citations	PageRank
Philip Colangelo	1	10	4.58
Nasibeh Nasiri	2	14	3.14
Eriko Nurvitadhi	3	399	33.08
Asit K. Mishra	4	1216	46.21
Martin Margala	5	75	9.77
Kevin Nealis	6	4	1.74
