Abstract |
---|
We explore training attention-based encoder-decoder ASR models in low-resource settings. These models perform poorly when trained on small amounts of transcribed speech, in part because they depend on having sufficient target-side text to train the attention and decoder networks. In this paper we address this shortcoming by pretraining our network parameters using only text-based data and transcribed speech from other languages. We analyze the relative contributions of both sources of data. Across three test languages, our text-based approach resulted in a 20% average relative improvement over a text-based augmentation technique without pretraining. Using transcribed speech from nearby languages gives a further 20-30% relative reduction in character error rate. |
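The core idea of pretraining the decoder on text-only data before fine-tuning on transcribed speech can be illustrated with a minimal sketch. The toy PyTorch code below is an assumption, not the authors' implementation: it pretrains a character-level decoder as a language model (no acoustic input), then transfers those weights into an ASR decoder for fine-tuning. All module names, vocabulary and hidden sizes are hypothetical.

```python
# Minimal sketch of text-only decoder pretraining for encoder-decoder ASR.
# Illustrative only; the paper's actual recipe and hyperparameters differ.
import torch
import torch.nn as nn

VOCAB = 64   # hypothetical character vocabulary size
HID = 128    # hypothetical hidden size

class Decoder(nn.Module):
    """Toy character decoder: predicts the next character from previous ones."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HID)
        self.rnn = nn.LSTM(HID, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, prev_chars, context=None):
        h = self.embed(prev_chars)
        if context is not None:   # encoder context; absent during text-only pretraining
            h = h + context
        h, _ = self.rnn(h)
        return self.out(h)

# Stage 1: pretrain the decoder as a character LM on text-only data.
decoder = Decoder()
opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
text_batch = torch.randint(0, VOCAB, (8, 20))  # stand-in for real transcripts
opt.zero_grad()
logits = decoder(text_batch[:, :-1])           # predict the next character
loss = loss_fn(logits.reshape(-1, VOCAB), text_batch[:, 1:].reshape(-1))
loss.backward()
opt.step()

# Stage 2: transfer the pretrained weights into the ASR decoder, which is
# then fine-tuned jointly with an acoustic encoder on transcribed speech.
asr_decoder = Decoder()
asr_decoder.load_state_dict(decoder.state_dict())
```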
Year | DOI | Venue
---|---|---|
2019 | 10.21437/Interspeech.2019-3254 | INTERSPEECH

DocType | Citations | PageRank
---|---|---|
Conference | 0 | 0.34

References | Authors
---|---|
0 | 6
Name | Order | Citations | PageRank |
---|---|---|---|
Matthew Wiesner | 1 | 5 | 2.85 |
Adithya Renduchintala | 2 | 1 | 1.74 |
Shinji Watanabe | 3 | 1158 | 139.38 |
Chunxi Liu | 4 | 23 | 3.28 |
N. Dehak | 5 | 1269 | 92.64 |
Sanjeev Khudanpur | 6 | 2155 | 202.00 |