Abstract | ||
---|---|---|
Young speakers are not represented adequately in current speech recognizers. In this paper we focus on the problem of adapting the acoustic frontend of a speech recognizer which has been trained on adults' speech to achieve better performance on speech from children. We introduce and evaluate a method to perform non-linear VTLN by an unconstrained data-driven optimization of the filterbank. A second approach normalizes the speaking rate of the young speakers with the PSOLA algorithm. Significant reductions in word error rate have been achieved. |
Year | Venue | Keywords |
---|---|---|
2003 | INTERSPEECH | word error rate |
Field | DocType | Citations |
PSOLA, Normalization (statistics), Computer science, Word error rate, Filter bank, Speech recognition | Conference | 20 |
PageRank | References | Authors |
1.57 | 9 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Georg Stemmer | 1 | 132 | 15.27 |
Christian Hacker | 2 | 235 | 22.51 |
Stefan Steidl | 3 | 1140 | 79.71 |
Elmar Nöth | 4 | 959 | 158.94 |