Title
Semi-Supervised Maximum Mutual Information Training Of Deep Neural Network Acoustic Models
Abstract
Maximum Mutual Information (MMI) is a popular discriminative criterion that has been used in supervised training of acoustic models for automatic speech recognition. However, standard discriminative training is very sensitive to the accuracy of the transcription, and hence its implementation in a semi-supervised setting requires extensive filtering of data. We show that if the supervision transcripts are not known, the natural analogue of MMI is to minimize the conditional entropy of the lattice of possible transcripts of the data. This is equivalent to the weighted average of the MMI criterion over different reference transcripts, taking those reference transcripts and their weighting from the lattice itself. In this paper, we describe experiments in which we applied this method to the semi-supervised training of Deep Neural Network (DNN) acoustic models. In our experimental setup, the proposed method gives up to 0.5% absolute WER improvement over a DNN trained with sMBR only on the transcribed part of the data. This is 37% of the improvement that we would get from doing sMBR training if we had the transcripts for the untranscribed part of the data.
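The equivalence stated in the abstract can be checked numerically on a toy example. The sketch below is not the authors' implementation: it assumes the lattice is replaced by a small list of competing transcripts with made-up log scores, and the function names (`posteriors`, `mmi_objective`, `negative_conditional_entropy`, `posterior_weighted_mmi`) are hypothetical. It only verifies that the negative entropy of the hypothesis posteriors has the same value as the posterior-weighted average of the MMI objective, where each hypothesis takes a turn as the reference.

```python
import math


def posteriors(log_scores):
    """Softmax over hypothesis log scores (log acoustic + LM score per path)."""
    m = max(log_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in log_scores]
    z = sum(exps)
    return [e / z for e in exps]


def mmi_objective(log_scores, ref_index):
    """MMI objective for one reference: log score of the reference
    minus the log of the summed scores of all competing hypotheses."""
    m = max(log_scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in log_scores))
    return log_scores[ref_index] - log_z


def negative_conditional_entropy(log_scores):
    """Negative entropy of the posterior over hypotheses (to be maximized)."""
    return sum(p * math.log(p) for p in posteriors(log_scores) if p > 0.0)


def posterior_weighted_mmi(log_scores):
    """Average of the MMI objective over all hypotheses used as references,
    weighted by each hypothesis's own posterior."""
    post = posteriors(log_scores)
    return sum(p * mmi_objective(log_scores, i) for i, p in enumerate(post))


# Hypothetical log scores for three competing transcripts of one utterance.
toy_scores = [-10.0, -11.5, -13.0]

neg_entropy = negative_conditional_entropy(toy_scores)
weighted_mmi = posterior_weighted_mmi(toy_scores)
print(neg_entropy, weighted_mmi)
```

The two printed values agree up to floating-point error because the MMI objective for a given reference is exactly the log posterior of that reference. A real system would compute these quantities over a full lattice rather than an enumerated list.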
Year
2015
Venue
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5
Keywords
Semi-supervised Learning, Lattice Entropy, Deep Neural Network, Acoustic Modeling, Speech Recognition
Field
Weighting, Pattern recognition, Computer science, Filter (signal processing), Speech recognition, Artificial intelligence, Mutual information, Supervised training, Conditional entropy, Artificial neural network, Discriminative model
DocType
Conference
Citations
7
PageRank
0.66
References
22
Authors
3
Name               Order  Citations  PageRank
Vimal Manohar      1      54         7.99
Daniel Povey       2      2442       231.75
Sanjeev Khudanpur  3      2155       202.00