Title
A combination of speaker normalization and speech rate normalization for automatic speech recognition
Abstract
In this contribution, a normalization procedure for automatic speech recognition is introduced that aims at reducing speaking-rate-specific variations in the features of the phonetic classes. A "spurtwise" calculation of normalization factors makes it possible to capture changes of the speaking rate within one utterance. The cost-saving implementation, which uses linear interpolation of the original features and a word graph rescoring procedure, leads to a moderate increase in computational load compared to the baseline system without speech rate normalization. In addition, a two-step procedure that combines vocal tract length normalization (VTLN) and speech rate normalization (SRN) has been developed. Experiments showed that applying SRN to a VTLN-based recognition system leads to a relative reduction in word error rate of 4.2%. This is comparable to the decrease observed when using SRN on a system without VTLN. All in all, the combination of VTLN and SRN results in a 15% reduction of word error rate compared to the baseline system.
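The abstract mentions linear interpolation of the original features but gives no implementation details. The following is a minimal sketch of one plausible reading, assuming the interpolation runs along the time axis and that each spurt already has a rate factor assigned. The function name `stretch_features`, the `rate_factor` parameter, and the list-of-frames layout are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Frame = List[float]


def stretch_features(features: Sequence[Sequence[float]],
                     rate_factor: float) -> List[Frame]:
    """Resample a sequence of feature frames along time.

    Assumption (not from the paper): rate_factor > 1 lengthens the spurt,
    rate_factor < 1 shortens it. New frames are obtained by linear
    interpolation between the two neighbouring original frames.
    """
    n_frames = len(features)
    if n_frames < 2:
        # Nothing to interpolate between; return a copy unchanged.
        return [list(f) for f in features]

    n_out = max(2, round(n_frames * rate_factor))
    # Distance between output frames, measured on the original frame axis.
    step = (n_frames - 1) / (n_out - 1)

    out: List[Frame] = []
    for i in range(n_out):
        pos = i * step
        # Index of the left neighbour, clamped so left + 1 stays valid.
        left = min(int(pos), n_frames - 2)
        w = pos - left  # interpolation weight of the right neighbour
        out.append([(1.0 - w) * a + w * b
                    for a, b in zip(features[left], features[left + 1])])
    return out


def normalize_spurts(spurts, rate_factors):
    """Apply one hypothetical rate factor per spurt of an utterance."""
    return [stretch_features(s, f) for s, f in zip(spurts, rate_factors)]
```

Because the new frames are computed directly from the existing ones, a scheme like this avoids re-running the acoustic front end, which fits the abstract's description of the implementation as cost-saving. How the per-spurt factors are estimated and how the word graph rescoring uses the result are not covered by this sketch.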
Year: 2000
Venue: INTERSPEECH
Keywords: linear interpolation, automatic speech recognition, word error rate
Field: Normalization (statistics), Recognition system, Pattern recognition, Computer science, Voice activity detection, Word error rate, Utterance, Speech recognition, Artificial intelligence, Linear interpolation, Baseline system, Vocal tract
DocType: Conference
Citations: 7
PageRank: 0.59
References: 15
Authors: 3
Name, Order, Citations, PageRank
Thilo Pfau, 1, 113, 15.74
Robert Faltlhauser, 2, 26, 3.62
Günther Ruske, 3, 154, 36.13