Title
Noise-tolerant speech recognition: the SNN-TA approach
Abstract
Neural network learning theory draws a relationship between "learning with noise" and applying a regularization term to the cost function minimized during training on clean (noise-free) data. Regularizers and other robust training techniques aim to improve the generalization capabilities of connectionist models by reducing overfitting. In spite of that, the generalization problem is usually overlooked by automatic speech recognition (ASR) practitioners who use hidden Markov models (HMMs) or other standard ASR paradigms. Nonetheless, it is reasonable to expect that an adequate neural network model (due to its universal approximation property and generalization capability), along with a suitable regularizer, can exhibit good recognition performance when noise is added to the test data, even though training is accomplished on clean data. This paper presents applications of a variant of the so-called segmental neural network (SNN), introduced at BBN by Zavaliagkos et al. for rescoring the N-best hypotheses yielded by a standard continuous density HMM (CDHMM). An enhanced connectionist model, called the SNN with trainable amplitude of activation functions (SNN-TA), is first used in this paper instead of the CDHMM to perform recognition of isolated words. A Viterbi-based segmentation relying on the level-building algorithm is then introduced, which can be combined with the SNN-TA to obtain a hybrid framework for continuous speech recognition. The proposed paradigm is applied to the recognition of isolated and connected Italian digits under several noisy conditions, outperforming the CDHMMs.
Year
DOI
Venue
2003
10.1016/S0020-0255(03)00164-6
Inf. Sci.
Keywords
Field
DocType
adequate neural network model,automatic speech recognition,neural network,generalization capability,clean data,generalization problem,noise-tolerant speech recognition,continuous speech recognition,snn-ta approach,robust training technique,good recognition performance,segmental neural network,activation function,hidden markov model,learning theory,speech recognition,cost function,approximation property,neural network model
Computer science,Regularization (mathematics),Time delay neural network,Artificial intelligence,Overfitting,Artificial neural network,Connectionism,Viterbi algorithm,Pattern recognition,Speech recognition,Test data,Hidden Markov model,Machine learning
Journal
Volume
Issue
ISSN
156
1-2
0020-0255
Citations
PageRank
References
3
0.43
7
Authors
2
Name
Order
Citations
PageRank
Edmondo Trentin	1	286	29.25
Marco Matassoni	2	165	26.06