Title | ||
---|---|---|
Study on Simultaneous Estimation of Glottal Source and Vocal Tract Parameters by ARMAX-LF Model for Speech Analysis/Synthesis |
Abstract | ||
---|---|---|
Correct estimation of glottal source as well as vocal tract parameters is crucial for speech analysis and synthesis. Nearly all methods for estimating these parameters are based on the source-filter assumption. However, the separation and estimation of the source and filter parts are still challenging due to the unreasonable modeling related to physiological processes of speech production or inappropriate estimation procedures. We propose a model that combines the autoregressive moving average exogenous (ARMAX) and Liljencrants-Fant (LF) models, called the ARMAX-LF model, to accurately represent the physiological processes of speech production. The ARMAX model represents the vocal tract as a pole-zero filter with an additional exogenous residual signal, and the LF model represents the glottal source waveform as a parametrized time-domain model. Furthermore, we propose a two-stage iterative estimation procedure to separately and simultaneously estimate the parameters of the ARMAX-LF model. The estimated parameters were evaluated objectively and subjectively with synthesized vowels, synthesized consonants, and natural speech. The results indicate that the ARMAX-LF model with the estimated parameters can separately represent the glottal source and vocal tract characteristics and can be widely used in speech analysis and synthesis. |
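
The abstract describes the vocal tract as a pole-zero (ARMAX) filter driven by a source plus a residual term. As a rough illustration only, the textbook ARMAX difference equation can be sketched as below. This is not the authors' implementation: the function name `armax_simulate`, the coefficient values, and the assignment of `u` (source excitation) and `e` (residual) are assumptions made for the example, and the impulse input is a stand-in, not an LF waveform.

```python
def armax_simulate(a, b, c, u, e):
    """Textbook ARMAX recursion (illustrative sketch, not the paper's code).

    y[n] = -sum_{i>=1} a[i]*y[n-i] + sum_j b[j]*u[n-j] + sum_k c[k]*e[n-k]

    a: AR (pole) coefficients, with a[0] == 1
    b: coefficients applied to the exogenous input u (zeros)
    c: coefficients applied to the residual e, with c[0] == 1
    u, e: equal-length input sequences
    """
    if a[0] != 1 or c[0] != 1:
        raise ValueError("a and c must be monic (first coefficient 1)")
    if len(u) != len(e):
        raise ValueError("u and e must have the same length")

    y = []
    for n in range(len(u)):
        acc = 0.0
        # Contribution of the exogenous input through the B polynomial.
        for j, bj in enumerate(b):
            if n - j >= 0:
                acc += bj * u[n - j]
        # Contribution of the residual through the C polynomial.
        for k, ck in enumerate(c):
            if n - k >= 0:
                acc += ck * e[n - k]
        # Feedback from past outputs through the A polynomial (poles).
        for i in range(1, len(a)):
            if n - i >= 0:
                acc -= a[i] * y[n - i]
        y.append(acc)
    return y


# Hypothetical example: one pole at 0.5, unit impulse as input, no residual.
impulse = [1.0, 0.0, 0.0, 0.0]
silence = [0.0, 0.0, 0.0, 0.0]
response = armax_simulate([1, -0.5], [1.0], [1], impulse, silence)
```

In a setting like the one the abstract outlines, `u` would be replaced by a parametric glottal source waveform and the coefficients would be estimated from speech rather than fixed by hand.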
Year | Venue | DocType |
---|---|---|
2021 | 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) | Conference |
ISSN | ISBN | Citations |
---|---|---|
2640-009X | 978-1-6654-4162-9 | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Kai Li | 1 | 0 | 0.34 |
Masashi Unoki | 2 | 0 | 0.34 |
Yongwei Li | 3 | 0 | 0.34 |
Jianwu Dang | 4 | 0 | 0.34 |
Masato Akagi | 5 | 0 | 0.34 |