Title
Improving deep neural networks using state projection vectors of subspace Gaussian mixture model as features
Abstract
Recent advances in deep neural networks (DNNs) have surpassed the conventional hidden Markov model-Gaussian mixture model (HMM-GMM) framework, owing to their efficient training procedure. Providing better phonetic context information at the input improves DNN performance. The state projection vectors (state-specific vectors) of the subspace Gaussian mixture model (SGMM) capture phonetic information in a low-dimensional vector space. In this paper, we propose to use the state-specific vectors of the SGMM as features, thereby providing additional phonetic information to the DNN framework. For each observation vector in the training data, the corresponding SGMM state-specific vector is aligned to form the state-specific vector feature set. The linear discriminant analysis (LDA) feature set is formed by applying LDA to the training data. Since bottleneck features are efficient at extracting useful discriminative information about the phonemes, the LDA feature set and the state-specific vector feature set are both converted to bottleneck features. The bottleneck features of both feature sets then act as input features to train a single DNN framework. Relative improvements of 8.8% on the TIMIT database (core test set) and 9.7% on the WSJ corpus are obtained with the state-specific vector bottleneck feature set, compared with a DNN trained only on the LDA bottleneck feature set. In addition, training a deep belief network-DNN (DBN-DNN) with the proposed feature set attains a WER of 20.46% on the TIMIT core test set, demonstrating the effectiveness of our method. Acting as features, the state-specific vectors provide additional useful information about phoneme variation; combining them with LDA bottleneck features therefore improves the performance of the DNN framework.
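The frame-level feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a frame-to-state alignment is already available (for example from forced alignment), that each HMM state has one state-specific vector, and that the two feature streams are joined by frame-wise concatenation, which the abstract does not state explicitly. The bottleneck networks are omitted, and the function names `align_state_vectors` and `concat_features` and all numeric values are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def align_state_vectors(
    state_ids: Sequence[int],
    state_vectors: Dict[int, Vector],
) -> List[Vector]:
    """Look up one state-specific vector per frame.

    state_ids[t] is the HMM state aligned to frame t (assumed given).
    state_vectors maps a state id to its vector (toy values here).
    """
    return [list(state_vectors[s]) for s in state_ids]


def concat_features(
    first: Sequence[Vector],
    second: Sequence[Vector],
) -> List[Vector]:
    """Concatenate two feature streams frame by frame.

    Frame-wise concatenation is an assumption of this sketch.
    """
    if len(first) != len(second):
        raise ValueError("both streams must have the same number of frames")
    return [list(a) + list(b) for a, b in zip(first, second)]


# Toy data: two states with 2-dimensional vectors, three frames.
toy_state_vectors = {0: [0.1, 0.2], 1: [0.3, 0.4]}
toy_alignment = [0, 0, 1]

# Stand-in for per-frame LDA (or LDA bottleneck) features.
toy_lda_features = [[1.0], [2.0], [3.0]]

aligned = align_state_vectors(toy_alignment, toy_state_vectors)
combined = concat_features(toy_lda_features, aligned)
```

In this sketch every frame aligned to the same state receives the same vector, so the second stream carries state-level rather than frame-level information. In the paper, both streams are converted to bottleneck features before they are used to train the DNN.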
Year
2014
DOI
10.1109/SLT.2014.7078562
Venue
SLT
Keywords
speech processing,deep neural networks,phonetic context information,discriminative information extraction,statistical analysis,learning (artificial intelligence),state projection vectors,subspace gaussian mixture model,mixture models,sgmm,deep belief network training,deep neural network,linear discriminant analysis,low dimensional vector space,hmm-gmm framework,dbn-dnn,wsj corpus,gaussian processes,lda feature set,timit database,state specific vector bottleneck feature set,state specific vectors,bottleneck features,hidden markov models,neural nets,phoneme variation,hidden markov model-gaussian mixture model
Field
Subspace Gaussian Mixture Model,Bottleneck,TIMIT,Pattern recognition,Computer science,Deep belief network,Speech recognition,Artificial intelligence,Linear discriminant analysis,Hidden Markov model,Mixture model,Test set
DocType
Conference
ISSN
2639-5479
Citations
0
PageRank
0.34
References
4
Authors
2
Name               Order  Citations  PageRank
Murali Karthick B  1      0          0.34
Srinivasan Umesh   2      9316.31