Paper Info

Title
JHU Kaldi system for Arabic MGB-3 ASR challenge using diarization, audio-transcript alignment and transfer learning

Abstract
This paper describes the JHU team's Kaldi system submission to the Arabic MGB-3: The Arabic speech recognition in the Wild Challenge for ASRU-2017. We use a weight-transfer approach to adapt a neural network trained on the out-of-domain MGB-2 multi-dialect Arabic TV broadcast corpus to the MGB-3 Egyptian YouTube video corpus. The neural network has a TDNN-LSTM architecture and is trained using the lattice-free maximum mutual information (LF-MMI) objective followed by state-level minimum Bayes risk (sMBR) discriminative training. For supervision, we fuse transcripts from four independent transcribers into confusion network training graphs. We also describe our own approach for speaker diarization and audio-transcript alignment, which we use to prepare lightly supervised transcriptions for training the seed system used for adaptation to MGB-3. Our primary submission to the challenge gives a multi-reference WER of 32.78% on the MGB-3 test set.

Year	2017
DOI	10.1109/ASRU.2017.8268956
Venue	2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Keywords	Multi-genre broadcast, Automatic speech recognition, Lightly-supervised training, LF-MMI, Segmentation
DocType	Conference
ISBN	978-1-5090-4789-5
Citations	0
PageRank	0.34
References	0
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Vimal Manohar	1	54	7.99
Daniel Povey	2	2442	231.75
Sanjeev Khudanpur	3	2155	202.00
