Title
Study on Simultaneous Estimation of Glottal Source and Vocal Tract Parameters by ARMAX-LF Model for Speech Analysis/Synthesis
Abstract
Accurate estimation of glottal source and vocal tract parameters is crucial for speech analysis and synthesis. Nearly all methods for estimating these parameters are based on the source-filter assumption. However, separating and estimating the source and filter parts remains challenging, either because the models poorly reflect the physiological processes of speech production or because the estimation procedures are inappropriate. We propose a model that combines the autoregressive moving average exogenous (ARMAX) and Liljencrants-Fant (LF) models, called the ARMAX-LF model, to accurately represent the physiological processes of speech production. The ARMAX model represents the vocal tract as a pole-zero filter with an additional exogenous residual signal, and the LF model represents the glottal source waveform as a parametrized time-domain model. Furthermore, we propose a two-stage iterative estimation procedure to separately and simultaneously estimate the parameters of the ARMAX-LF model. The estimated parameters were evaluated objectively and subjectively with synthesized vowels, synthesized consonants, and natural speech. The results indicate that the ARMAX-LF model with the estimated parameters can separately represent the glottal source and vocal tract characteristics and can be widely used in speech analysis and synthesis.
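As a rough illustration of the source-filter structure the abstract describes, the sketch below simulates a generic ARMAX difference equation, A(q) y[n] = B(q) u[n] + C(q) e[n]. It is not the authors' implementation. It assumes that the glottal waveform acts as the exogenous input u and that the residual acts as the noise term e, a mapping the abstract does not spell out. The names `armax_filter` and `toy_glottal_derivative` are hypothetical, the coefficients are arbitrary, and the raised-cosine pulse is a simple stand-in for the LF waveform, not the LF model itself.

```python
import numpy as np


def armax_filter(a, b, c, u, e):
    """Simulate A(q) y[n] = B(q) u[n] + C(q) e[n].

    a: autoregressive (pole) coefficients, a[0] is the leading term
    b: coefficients applied to the exogenous input u
    c: moving-average coefficients applied to the residual e
    """
    a, b, c = (np.asarray(x, dtype=float) for x in (a, b, c))
    y = np.zeros(len(u))
    for n in range(len(u)):
        acc = 0.0
        for k in range(len(b)):          # exogenous (source) contribution
            if n - k >= 0:
                acc += b[k] * u[n - k]
        for k in range(len(c)):          # residual (noise) contribution
            if n - k >= 0:
                acc += c[k] * e[n - k]
        for k in range(1, len(a)):       # feedback from past outputs
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc / a[0]
    return y


def toy_glottal_derivative(period, open_len):
    """One period of a raised-cosine flow pulse, differentiated.

    Stand-in for a glottal flow derivative; NOT the LF model.
    """
    n = np.arange(open_len)
    flow = np.zeros(period)
    flow[:open_len] = 0.5 * (1.0 - np.cos(2.0 * np.pi * n / open_len))
    return np.diff(flow, prepend=0.0)


# Four pitch periods of the toy source plus a small random residual.
source = np.tile(toy_glottal_derivative(80, 50), 4)
residual = 0.01 * np.random.default_rng(0).standard_normal(len(source))

# Arbitrary stable pole-zero coefficients (complex pole pair, radius < 1).
a = [1.0, -1.3, 0.8]
b = [1.0, 0.4]
c = [1.0]
speech = armax_filter(a, b, c, source, residual)
```

Estimation runs in the opposite direction: given only `speech`, recover the coefficients and the source parameters. The two-stage iterative procedure mentioned in the abstract addresses that problem and is not reproduced here.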
Year
2021
Venue
2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
DocType
Conference
ISSN
2640-009X
ISBN
978-1-6654-4162-9
Citations
0
PageRank
0.34
References
0
Authors
5
Name            Order   Citations   PageRank
Kai Li          1       0           0.34
Masashi Unoki   2       0           0.34
Yongwei Li      3       0           0.34
Jianwu Dang     4       0           0.34
Masato Akagi    5       0           0.34