Title
Hardware Acceleration of Fully Quantized BERT for Efficient Natural Language Processing
Abstract
BERT is a recent Transformer-based model that achieves state-of-the-art performance on various NLP tasks. In this paper, we investigate the hardware acceleration of BERT on FPGA for edge computing. To tackle the issue of huge computational complexity and memory footprint, we propose to fully quantize BERT (FQ-BERT), including weights, activations, softmax, layer normalization, and all t...
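The abstract centers on fully quantizing BERT's weights, activations, and nonlinear operations for FPGA deployment. For orientation only, the sketch below shows generic symmetric linear quantization in Python; the function names, 8-bit width, and clipping scheme are illustrative assumptions and not the paper's actual FQ-BERT quantizer, which additionally covers softmax and layer normalization.

```python
import numpy as np

def quantize_symmetric(x: np.ndarray, num_bits: int = 8):
    """Symmetric linear quantization to signed integers.

    Returns (q, scale) such that q * scale approximates x.
    A generic scheme for illustration, not the exact FQ-BERT quantizer.
    """
    qmax = 2 ** (num_bits - 1) - 1                        # e.g. 127 for 8-bit
    scale = max(float(np.max(np.abs(x))) / qmax, 1e-12)   # guard against an all-zero tensor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float tensor."""
    return q.astype(np.float32) * scale

# Quantize a toy weight matrix and check the reconstruction error,
# which is bounded by scale / 2 per element for in-range values.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_symmetric(w, num_bits=8)
print(np.max(np.abs(w - dequantize(q, s))))
```

Replacing floating-point tensors with small integers like this is what enables the memory-footprint and energy-efficiency gains on FPGA that the abstract claims, since integer multiply-accumulates map directly onto DSP slices.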
Year
2021
DOI
10.23919/DATE51398.2021.9474043
Venue
2021 Design, Automation & Test in Europe Conference & Exhibition (DATE)
Keywords
Computational modeling, Bit error rate, Graphics processing units, Natural language processing, Energy efficiency, Task analysis, Computational complexity
DocType
Conference
ISSN
Design, Automation & Test in Europe (DATE) 2021
ISBN
978-3-9819263-5-4
Citations
1
PageRank
0.43
References
0
Authors
3
Name        Order  Citations  PageRank
Zejian Liu  1      2          1.14
Gang Li     2      1          1.45
Jian Cheng  3      1327       115.72