Title
9.8 A 25mm<sup>2</sup> SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence DNN Speech Recognition in 16nm FinFET
Abstract
Automatic speech recognition (ASR) using deep learning is essential for user interfaces on IoT devices. However, previously published ASR chips [4-7] do not consider realistic operating conditions, which are typically noisy and may include more than one speaker. Furthermore, several of these works implement only small-vocabulary tasks, such as keyword spotting (KWS), where context-blind deep neural network (DNN) algorithms are adequate. For large-vocabulary tasks (e.g., >100k words), by contrast, the more complex bidirectional RNNs with an attention mechanism [1] provide context learning over long sequences, which improves ASR accuracy by up to 62% on the 200k-word LibriSpeech dataset compared to a simpler unidirectional RNN (Fig. 9.8.1). Attention-based networks emphasize the most relevant parts of the source sequence at each decoding time step. In doing so, the encoder sequence is treated as a soft-addressable memory whose positions are weighted based on the state of the decoder RNN. Bidirectional RNNs learn past and future temporal information by concatenating the forward and backward time steps.
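The two mechanisms described in the abstract, soft-addressable attention over the encoder sequence and concatenation of forward and backward RNN states, can be illustrated with a minimal sketch. It is not the chip's implementation: the dot-product scoring function, the function names (`attend`, `concat_bidirectional`), and the toy vectors are all assumptions made for illustration, and the paper may use a different attention scoring scheme.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """Weight every encoder position by its similarity to the decoder state.

    Dot-product scoring is an assumption; other scoring functions exist.
    Returns the attention weights and the weighted-sum context vector.
    """
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)
    ]
    return weights, context


def concat_bidirectional(forward_states, backward_states):
    """Concatenate forward and backward RNN outputs per time step.

    backward_states is assumed to be re-ordered to forward time already.
    """
    return [f + b for f, b in zip(forward_states, backward_states)]


# Hypothetical toy data: 3 time steps, 2 features per direction.
forward = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
backward = [[0.5, 0.5], [1.0, 0.0], [0.0, 0.0]]
encoder_states = concat_bidirectional(forward, backward)  # 3 steps x 4 features

decoder_state = [0.0, 2.0, 0.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

With these toy values, the second and third encoder positions score equally and higher than the first, so they receive most of the attention weight, and the context vector is their weighted blend. The weights always sum to one, which is what makes the encoder sequence behave like a softly addressed memory rather than a single selected slot.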
Year
2021
DOI
10.1109/ISSCC42613.2021.9366062
Venue
2021 IEEE International Solid-State Circuits Conference (ISSCC)
Keywords
SoC, IoT devices, Bayesian speech denoising, sequence-to-sequence DNN speech recognition, FinFET, automatic speech recognition, deep learning, user interfaces, ASR chips, realistic operating conditions, small-vocabulary tasks, large-vocabulary tasks, complex bidirectional RNNs, attention mechanism, context learning, long sequences, ASR accuracy, 200k-word LibriSpeech dataset, attention-based networks, source sequence, encoder sequence, context-blind deep neural network, noise-robust speech-to-text latency, bidirectional RNN, decoder RNN, soft-addressable memory, time 18.0 ms, size 16.0 nm
DocType
Conference
Volume
64
ISSN
0193-6530
ISBN
978-1-7281-9550-6
Citations
3
PageRank
0.43
References
0
Authors
10
Name | Order | Citations | PageRank
Thierry Tambe | 1 | 18 | 3.43
En-Yu Yang | 2 | 10 | 2.31
Glenn G. Ko | 3 | 10 | 3.30
Yuji Chai | 4 | 5 | 2.16
Coleman Hooper | 5 | 7 | 1.17
Marco Donato | 6 | 31 | 5.83
Paul N. Whatmough | 7 | 147 | 20.59
Alexander M. Rush | 8 | 1499 | 67.53
David Brooks | 9 | 5518 | 422.08
Gu-Yeon Wei | 10 | 1927 | 214.15