Abstract | ||
---|---|---|
This paper proposes a simple yet effective unsupervised speaker adaptation approach for deep neural network acoustic models that use batch normalization. The basic idea is to recompute the means and variances (MV) in all batch normalization layers over the test data of each speaker, so that the distribution of the test data moves closer to that of the training data. The approach does not adjust any trainable parameters of the acoustic model. Experiments are conducted on the CHiME-3 datasets. On the real test set, the proposed MV adaptation achieves a 2.17% relative reduction in average word error rate (WER) compared with scaling and shifting factors (SSF) adaptation. Combining MV adaptation with SSF adaptation yields further improvement. |
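
The recomputation step described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the layer layout (linear, then batch normalization, then ReLU), the dictionary-based layer format, and all function names (`column_stats`, `batch_norm`, `linear`, `adapt_and_forward`) are assumptions made for illustration. The only idea taken from the abstract is that the means and variances of every batch normalization layer are recomputed over one speaker's test data while the trainable parameters stay frozen.

```python
from statistics import fmean, pvariance


def column_stats(frames):
    """Per-dimension mean and biased variance over all frames of one speaker."""
    columns = list(zip(*frames))
    return [fmean(c) for c in columns], [pvariance(c) for c in columns]


def batch_norm(frames, mean, var, gamma, beta, eps=1e-5):
    """Normalize with the given statistics; gamma and beta are left unchanged."""
    return [
        [g * (x - m) / (v + eps) ** 0.5 + b
         for x, m, v, g, b in zip(frame, mean, var, gamma, beta)]
        for frame in frames
    ]


def linear(frames, weight, bias):
    """Affine transform; `weight` is a list of output rows."""
    return [
        [sum(w * x for w, x in zip(row, frame)) + b
         for row, b in zip(weight, bias)]
        for frame in frames
    ]


def adapt_and_forward(layers, frames):
    """Recompute BN statistics layer by layer on one speaker's test frames.

    Assumed (hypothetical) layer format: a dict with frozen 'weight',
    'bias', 'gamma' and 'beta'. Returns the adapted activations and the
    per-layer (mean, variance) pairs that replace the training statistics.
    """
    stats = []
    h = frames
    for layer in layers:
        z = linear(h, layer["weight"], layer["bias"])
        mean, var = column_stats(z)  # speaker-specific statistics
        stats.append((mean, var))
        z = batch_norm(z, mean, var, layer["gamma"], layer["beta"])
        h = [[max(0.0, x) for x in frame] for frame in z]  # ReLU (assumed)
    return h, stats


# Toy data: one identity layer and two speakers with different feature offsets.
layers = [{
    "weight": [[1.0, 0.0], [0.0, 1.0]],
    "bias": [0.0, 0.0],
    "gamma": [1.0, 1.0],
    "beta": [0.0, 0.0],
}]
speaker_frames = {
    "spk1": [[1.0, 2.0], [3.0, 6.0]],
    "spk2": [[10.0, 0.0], [14.0, 4.0]],
}

# Statistics are recomputed separately for every speaker; no weights change.
adapted = {spk: adapt_and_forward(layers, f) for spk, f in speaker_frames.items()}
```

Because each layer's statistics are computed on the output of the already adapted layers below it, the recomputation has to proceed from the input layer upward, as the loop does. The speaker labels are assumed to be known at test time; no transcriptions are used, which is what makes the procedure unsupervised.
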
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/APSIPAASC47483.2019.9023185 | 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) |
Keywords | DocType | ISSN |
MV adaptation,SSF adaptation,acoustic model,effective unsupervised speaker adaptation approach,deep neural network acoustic models,batch normalization layers,test data,training data,CHiME-3 datasets,relative average word error rate reduction,WER | Conference | 2640-009X |
ISBN | Citations | PageRank |
978-1-7281-3249-5 | 0 | 0.34 |
References | Authors | |
7 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jiangyan Yi | 1 | 19 | 17.99 |
Jianhua Tao | 2 | 848 | 138.00 |