Abstract | ||
---|---|---|
This paper deals with the problem of building hidden Markov models (HMMs) suitable for fast speech. Fast speech leads to increased error rates on various tasks. The first part of the paper presents an automatic procedure for splitting speech material into categories according to speaking rate, which is a prerequisite for any investigation of speaking rate. The second part discusses the problem of the sparse data available for estimating HMMs for fast speech, followed by a comparison of different methods to overcome this problem. The main emphasis is on robust reestimation techniques such as maximum a posteriori (MAP) estimation, and on methods that reduce the variability of the speech signal and thereby allow the number of HMM parameters to be reduced. Vocal tract length normalization (VTLN) is chosen for that purpose. The last part compares various combinations of the methods discussed, based on error rates for continuous speech recognition on fast speech. The best method (VTLN followed by MAP reestimation) results in an overall decrease in the error rate of 10% relative to the baseline system. |
Year | Venue | Keywords |
---|---|---|
1998 | ICSLP | hidden Markov model,error rate,sparse data |
Field | DocType | Citations |
Maximum-entropy Markov model,Pattern recognition,Markov model,Computer science,Speech recognition,Artificial intelligence,Variable-order Markov model,Hidden Markov model,Viterbi algorithm,Hidden semi-Markov model | Conference | 7 |
PageRank | References | Authors |
0.57 | 14 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Thilo Pfau | 1 | 113 | 15.74 |
Günther Ruske | 2 | 154 | 36.13 |