Title
Noise-tolerant speech recognition: the SNN-TA approach
Abstract
Neural network learning theory draws a relationship between "learning with noise" and applying a regularization term to the cost function minimized during training on clean (noise-free) data. Regularizers and other robust training techniques aim to improve the generalization capabilities of connectionist models by reducing overfitting. In spite of that, the generalization problem is usually overlooked by automatic speech recognition (ASR) practitioners who use hidden Markov models (HMMs) or other standard ASR paradigms. Nonetheless, it is reasonable to expect that an adequate neural network model (due to its universal approximation property and generalization capability), along with a suitable regularizer, can exhibit good recognition performance when noise is added to the test data, even though training is accomplished on clean data. This paper presents applications of a variant of the so-called segmental neural network (SNN), introduced at BBN by Zavaliagkos et al. for rescoring the N-best hypotheses yielded by a standard continuous density HMM (CDHMM). An enhanced connectionist model, called the SNN with trainable amplitude of activation functions (SNN-TA), is first used in this paper instead of the CDHMM to perform recognition of isolated words. A Viterbi-based segmentation relying on the level-building algorithm is then introduced, which can be combined with the SNN-TA to obtain a hybrid framework for continuous speech recognition. The proposed paradigm is applied to the recognition of isolated and connected Italian digits under several noisy conditions, outperforming the CDHMMs.
Year
DOI
Venue
2003
10.1016/S0020-0255(03)00164-6
Inf. Sci.
Keywords
Field
DocType
adequate neural network model,automatic speech recognition,neural network,generalization capability,clean data,generalization problem,noise-tolerant speech recognition,continuous speech recognition,snn-ta approach,robust training technique,good recognition performance,segmental neural network,activation function,hidden markov model,learning theory,speech recognition,cost function,approximation property,neural network model
Computer science,Regularization (mathematics),Time delay neural network,Artificial intelligence,Overfitting,Artificial neural network,Connectionism,Viterbi algorithm,Pattern recognition,Speech recognition,Test data,Hidden Markov model,Machine learning
Journal
Volume
Issue
ISSN
156
1-2
0020-0255
Citations
PageRank
References
3
0.43
7
Authors
2
Name
Order
Citations
PageRank
Edmondo Trentin	1	286	29.25
Marco Matassoni	2	165	26.06