Title | ||
---|---|---|
Learning From The Best: A Teacher-Student Multilingual Framework For Low-Resource Languages |
Abstract | ||
---|---|---|
The traditional method of pretraining neural acoustic models for low-resource languages initializes the acoustic model parameters by training on a large, annotated multilingual corpus, which can be a drain on time and resources. In an attempt to reuse TDNN-LSTMs already pre-trained using multilingual training, we apply Teacher-Student (TS) learning as a method of pretraining to transfer knowledge from a multilingual TDNN-LSTM to a TDNN. Using language-specific data during the teacher-student training reduces the pretraining time by an order of magnitude. Additionally, the TS architecture allows us to leverage untranscribed data that was previously untouched during supervised training. The best student TDNN achieves a WER within 1% of the teacher TDNN-LSTM performance and shows consistent improvement in recognition over TDNNs trained using the traditional pipeline across all the evaluation languages. Switching from TDNN-LSTM to TDNN also allows sub-real-time decoding. |
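
The teacher-student transfer described in the abstract can be illustrated with a minimal sketch. It assumes a frame-level KL-divergence objective between teacher and student output posteriors, which is a common choice for TS learning; the record does not state the paper's exact loss, and the function names `log_softmax`, `ts_frame_loss`, and `ts_loss` are hypothetical. The point it shows is that the training target comes from the teacher's outputs rather than from transcripts, which is why untranscribed audio can be used.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one frame's output logits."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_norm for x in logits]


def ts_frame_loss(teacher_logits, student_logits):
    """KL(teacher || student) for a single frame (assumed objective)."""
    log_p = log_softmax(teacher_logits)   # teacher posteriors act as soft targets
    log_q = log_softmax(student_logits)   # student posteriors being trained
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def ts_loss(teacher_frames, student_frames):
    """Average frame-level loss over an utterance; no transcript is required."""
    losses = [ts_frame_loss(t, s) for t, s in zip(teacher_frames, student_frames)]
    return sum(losses) / len(losses)
```

In this sketch the loss is zero when the student reproduces the teacher's posteriors exactly and positive otherwise, so minimizing it pushes the student (here standing in for the TDNN) toward the teacher (the multilingual TDNN-LSTM).
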
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/icassp.2019.8683491 | 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) |
Keywords | Field | DocType |
Teacher-student learning, Low-resource speech, Multilingual training, Automatic speech recognition | Architecture, Pattern recognition, Computer science, Reuse, Speech recognition, Time delay neural network, Artificial intelligence, Supervised training, Decoding methods, Initialization, Acoustic model | Conference |
ISSN | Citations | PageRank |
1520-6149 | 0 | 0.34 |
References | Authors | |
0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Deblin Bagchi | 1 | 4 | 0.78 |
William Hartmann | 2 | 64 | 10.66 |