| Abstract |
|---|
| Regularization and data augmentation are crucial to training end-to-end automatic speech recognition systems. Dropout is a popular regularization technique, which operates on each neuron independently by multiplying it with a Bernoulli random variable. We propose a generalization of dropout, called "convolutional dropout", where each neuron's activation is replaced with a randomly-weighted linear combination of neuron values in its neighborhood. We believe that this formulation combines the regularizing effect of dropout with the smoothing effects of the convolution operation. In addition to convolutional dropout, this paper also proposes using random word-piece segmentations as a data augmentation scheme during training, inspired by results in neural machine translation. We adopt both of these methods during the training of transformer-transducer speech recognition models, and show consistent WER improvements on Librispeech as well as across different languages. |
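The abstract describes the two techniques only at a high level. The sketch below illustrates the idea in plain NumPy, under stated assumptions: the neighborhood weight distribution, the normalization, and the edge padding for convolutional dropout are illustrative choices, not the paper's exact formulation, and the `random_segment` helper is a toy stand-in for sampling from a trained word-piece model.

```python
import numpy as np

def convolutional_dropout(x, kernel_size=3, rng=None):
    """Sketch of "convolutional dropout" as the abstract describes it:
    each unit's activation is replaced by a randomly weighted linear
    combination of activations in its neighborhood. The uniform weight
    distribution, sum-to-one normalization, and edge padding here are
    assumptions for illustration, not the paper's exact formulation."""
    rng = rng or np.random.default_rng()
    half = kernel_size // 2
    padded = np.pad(x, half, mode="edge")  # edge padding is an assumption
    out = np.empty_like(x, dtype=float)
    for i in range(len(x)):
        w = rng.random(kernel_size)  # fresh random weights per position
        w /= w.sum()                 # normalize so activations keep their scale
        out[i] = padded[i:i + kernel_size] @ w
    return out

def random_segment(word, vocab, rng=None):
    """Toy sketch of random word-piece segmentation: at each position,
    pick a random matching vocabulary piece, falling back to single
    characters so segmentation always succeeds. A real system would
    sample segmentations from a trained BPE/word-piece model."""
    rng = rng or np.random.default_rng()
    pieces, i = [], 0
    while i < len(word):
        options = [word[i:j] for j in range(i + 1, len(word) + 1)
                   if word[i:j] in vocab]
        piece = options[rng.integers(len(options))] if options else word[i]
        pieces.append(piece)
        i += len(piece)
    return pieces
```

Because each output unit is a convex combination of nearby inputs, convolutional dropout injects noise while smoothing, whereas standard dropout zeroes units independently; re-sampling `random_segment` each epoch yields different token sequences for the same transcript.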
| Year | DOI | Venue |
|---|---|---|
| 2021 | 10.1109/ICASSP39728.2021.9415004 | 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021) |

| Keywords | DocType | Citations |
|---|---|---|
| End-to-end speech recognition, dropout, regularization, Transformer, RNN-transducer | Conference | 0 |

| PageRank | References | Authors |
|---|---|---|
| 0.34 | 2 | 5 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Hainan Xu | 1 | 14 | 5.56 |
| Yinghui Huang | 2 | 1 | 2.39 |
| Yun Zhu | 3 | 0 | 1.69 |
| Kartik Audhkhasi | 4 | 189 | 23.25 |
| Bhuvana Ramabhadran | 5 | 1779 | 153.83 |