Iterative Pseudo-Labeling for Speech Recognition - Citegraph

Paper Info

Title
Iterative Pseudo-Labeling for Speech Recognition

Abstract
Pseudo-labeling has recently shown promise in end-to-end automatic speech recognition (ASR). We study Iterative Pseudo-Labeling (IPL), a semi-supervised algorithm which efficiently performs multiple iterations of pseudo-labeling on unlabeled data as the acoustic model evolves. In particular, IPL fine-tunes an existing model at each iteration using both labeled data and a subset of unlabeled data. We study the main components of IPL: decoding with a language model and data augmentation. We then demonstrate the effectiveness of IPL by achieving state-of-the-art word-error rate on the Librispeech test sets in both standard and low-resource settings. We also study the effect of language models trained on different corpora to show IPL can effectively utilize additional text. Finally, we release a new large in-domain text corpus which does not overlap with the Librispeech training transcriptions to foster research in low-resource, semi-supervised ASR.
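
The loop the abstract describes (relabel a subset of the unlabeled data with the current model, then fine-tune that same model on labeled plus pseudo-labeled data) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `iterative_pseudo_labeling`, `transcribe`, `fine_tune`, `subset_size`, and `iterations` are hypothetical, the model is represented only by two caller-supplied callables, and the language-model decoding and data augmentation mentioned in the abstract are assumed to live inside those callables.

```python
import random
from typing import Callable, List, Sequence, Tuple

# (utterance identifier, transcript); a stand-in for real audio/text pairs.
Example = Tuple[str, str]


def iterative_pseudo_labeling(
    labeled: Sequence[Example],
    unlabeled: Sequence[str],
    transcribe: Callable[[str], str],
    fine_tune: Callable[[List[Example]], None],
    iterations: int = 3,
    subset_size: int = 2,
    seed: int = 0,
) -> None:
    """Toy IPL-style loop; all parameters are illustrative assumptions."""
    rng = random.Random(seed)
    pool = list(unlabeled)
    for _ in range(iterations):
        # Draw a fresh subset of the unlabeled pool for this iteration.
        subset = rng.sample(pool, min(subset_size, len(pool)))
        # Label the subset with the current model. In the paper this step
        # decodes with a language model; here it is whatever `transcribe` does.
        pseudo_labeled = [(utt, transcribe(utt)) for utt in subset]
        # Continue training the same model on labeled + pseudo-labeled data,
        # rather than restarting from scratch.
        fine_tune(list(labeled) + pseudo_labeled)
```

Because `transcribe` is called again at every iteration, the pseudo-labels are regenerated as the model changes, which is the sense in which labeling tracks the evolving acoustic model.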

Year	2020
DOI	10.21437/Interspeech.2020-1800
Venue	INTERSPEECH
DocType	Conference
Citations	6
PageRank	0.42
References	0
Authors	6

Authors (6 rows)

Name	Order	Citations	PageRank
Qiantong Xu	1	34	7.42
Tatiana Likhomanenko	2	24	5.47
Jacob Kahn	3	20	2.38
Awni Y. Hannun	4	517	27.54
Gabriel Synnaeve	5	21	5.12
Ronan Collobert	6	4002	308.61
