Title: Speaker Adaptation for End-to-End CTC Models
Abstract
We propose two approaches for speaker adaptation in end-to-end (E2E) automatic speech recognition systems: Kullback-Leibler divergence (KLD) regularization and multi-task learning (MTL). Both approaches aim to address the data sparsity issue, and in particular the output-target sparsity issue, of speaker adaptation in E2E systems. KLD regularization adapts a model by forcing the output distribution of the adapted model to stay close to that of the unadapted model. MTL utilizes a jointly trained auxiliary task to improve the performance of the main task. We investigated our approaches on E2E connectionist temporal classification (CTC) models with three different types of output units. Experiments on the Microsoft short message dictation task demonstrated that MTL outperforms KLD regularization. In particular, MTL adaptation obtained 8.8% and 4.0% relative word error rate reductions (WERRs) for supervised and unsupervised adaptation of the word CTC model, and 9.6% and 3.8% relative WERRs for the mix-unit CTC model, respectively.
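The KLD regularization described above can be sketched as a loss that interpolates the adaptation task loss with a KL term pulling the adapted model's per-frame output distribution toward the frozen speaker-independent model. This is a minimal NumPy sketch, not the paper's exact formulation; the function name `kld_regularized_loss`, the interpolation weight `rho`, and the frame/label shapes are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax over the label dimension."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def kld_regularized_loss(task_loss, adapted_logits, unadapted_logits, rho=0.5):
    """Hypothetical sketch of KLD-regularized adaptation.

    task_loss        : scalar adaptation loss (e.g. CTC loss) on speaker data
    adapted_logits   : (frames, labels) logits from the model being adapted
    unadapted_logits : (frames, labels) logits from the frozen SI model
    rho              : assumed interpolation weight between the two terms
    """
    p_si = softmax(unadapted_logits)  # frozen speaker-independent distribution
    p_ad = softmax(adapted_logits)    # distribution of the adapted model
    # Per-frame KL(p_si || p_ad), then averaged over frames.
    kl = np.sum(p_si * (np.log(p_si + 1e-12) - np.log(p_ad + 1e-12)), axis=-1)
    return (1.0 - rho) * task_loss + rho * kl.mean()
```

With `rho = 0` the sketch reduces to plain adaptation on the task loss; with `rho = 1` the adapted model is only constrained to match the unadapted one.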
Year: 2018
DOI: 10.1109/SLT.2018.8639644
Venue: 2018 IEEE Spoken Language Technology Workshop (SLT)
Keywords: Adaptation models, Task analysis, Data models, Decoding, Acoustics, Artificial intelligence, Training
DocType: Conference
Volume: abs/1901.01239
ISSN: 2639-5479
ISBN: 978-1-5386-4334-1
Citations: 0
PageRank: 0.34
References: 25
Authors: 5
Authors (in order):
1. Ke Li
2. Jinyu Li
3. Yong Zhao
4. Kshitiz Kumar
5. Yifan Gong