Title | ||
---|---|---|
Speaker normalization and pronunciation variant modeling: helpful methods for improving recognition of fast speech |
Abstract | ||
---|---|---|
The presented paper addresses the problem of creating hidden Markov models for fast speech. The major issues discussed are robust parameter estimation and reducing within-model variations. Regarding the first issue, the use of the maximum a posteriori parameter estimation is discussed. To reduce within-model variations, a maximum likelihood based vocal tract length normalization procedure and a statistical approach to model pronunciation variants are applied. Experiments with a large vocabulary continuous speech recognition system were carried out on the German spontaneous scheduling task (Verbmobil) to prove the effectiveness of the investigated methods. The results show that a combination of pronunciation variant modeling and vocal tract length normalization is most effective. On fast speech, a relative improvement of 16.3% compared to the baseline models was achieved. Pronunciation variant modeling combined with the maximum a posteriori reestimation proved to be the second best method resulting in a 14.9% relative improvement. In addition, this combination does not cause any additional computational load during recognition. |
Year | Venue | Keywords |
---|---|---|
1999 | EUROSPEECH | hidden Markov model,maximum likelihood,parameter estimation |
Field | DocType | Citations |
Pronunciation,Normalization (statistics),Pattern recognition,Computer science,Speech recognition,Speaker recognition,Artificial intelligence,Maximum a posteriori estimation,Estimation theory,Hidden Markov model,Vocabulary,Vocal tract | Conference | 2 |
PageRank | References | Authors |
0.38 | 16 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Thilo Pfau | 1 | 113 | 15.74 |
Robert Faltlhauser | 2 | 26 | 3.62 |
Günther Ruske | 3 | 154 | 36.13 |