Title
Dynamic Assignment of Gaussian Components in Modelling Speech Spectra
Abstract
In this paper, we describe a parametric mixture model for the resonant characteristics of the vocal tract, in which Gaussian distributions model spectral frequency regions. A mixture of Gaussians (MoG) parametrisation scheme is used to model a smoothed representation of the spectra. The smoothing procedure removes all signal periodicity from the spectra, allowing highly natural analysis, manipulation and synthesis of speech. The goal of the parametrisation scheme is to ease the correspondence between the resonant characteristics of the vocal tract and the parametric distributions, and to model the spectrum with an appropriate number of parameters. Previously, a maximum likelihood (ML) approach to this parametrisation scheme was introduced. However, this approach has inherent local optima problems. Noting that a relatively small class of Gaussian densities can approximate a large class of distributions, we propose a new scheme whereby, starting with a large number of distributions in the mixture, we systematically reduce their number and re-approximate the densities in the mixture based on a distance criterion. The Kullback-Leibler (KL) distance was found to allow optimal MoG solutions to the spectra. Furthermore, a fitness measure based on KL information is used to provide a figure for estimating the model order in representing formant-like features. The proposed model is subjectively evaluated and is shown to reduce the number of Gaussians with an appreciable loss in the quality of the re-synthesised speech.
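The reduction idea in the abstract (start with many Gaussians, then repeatedly reduce their number using a KL-based distance) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's algorithm: the abstract does not give the exact distance, the merge rule, or the stopping criterion. Here components are one-dimensional `(weight, mean, variance)` tuples, the distance is the symmetric KL divergence between two Gaussians, and the closest pair is merged by moment matching. The function names `kl_gauss`, `sym_kl`, `merge` and `reduce_mixture` are hypothetical.

```python
import math
from itertools import combinations


def kl_gauss(m1, v1, m2, v2):
    """KL(N(m1, v1) || N(m2, v2)) for 1-D Gaussians; v1, v2 are variances."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)


def sym_kl(a, b):
    """Symmetric KL distance between two (weight, mean, variance) components."""
    (_, m1, v1), (_, m2, v2) = a, b
    return kl_gauss(m1, v1, m2, v2) + kl_gauss(m2, v2, m1, v1)


def merge(a, b):
    """Merge two components, preserving total weight, mean and variance."""
    (w1, m1, v1), (w2, m2, v2) = a, b
    w = w1 + w2
    m = (w1 * m1 + w2 * m2) / w
    v = (w1 * (v1 + (m1 - m) ** 2) + w2 * (v2 + (m2 - m) ** 2)) / w
    return (w, m, v)


def reduce_mixture(components, target):
    """Greedily merge the closest pair (by symmetric KL) until `target` remain."""
    if target < 1:
        raise ValueError("target must be at least 1")
    comps = list(components)
    while len(comps) > target:
        # Find the pair of component indices with the smallest distance.
        i, j = min(
            combinations(range(len(comps)), 2),
            key=lambda ij: sym_kl(comps[ij[0]], comps[ij[1]]),
        )
        merged = merge(comps[i], comps[j])
        comps = [c for k, c in enumerate(comps) if k not in (i, j)] + [merged]
    # Return components ordered by mean (e.g. by centre frequency).
    return sorted(comps, key=lambda c: c[1])


# Illustrative values only: centre frequencies in Hz, variances in Hz^2.
mixture = [
    (0.2, 500.0, 100.0 ** 2),
    (0.2, 550.0, 100.0 ** 2),
    (0.3, 1500.0, 150.0 ** 2),
    (0.3, 2500.0, 200.0 ** 2),
]
reduced = reduce_mixture(mixture, target=3)
```

In this toy example the two overlapping low-frequency components are the closest pair, so they are merged into one component while the other two are left untouched. A greedy pairwise merge is only one of several ways to re-approximate a mixture; it is used here because it is short and keeps the first two moments of the merged pair.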
Year: 2006
DOI: 10.1007/s11265-006-9768-3
Venue: VLSI Signal Processing
Keywords: ML-MoG algorithm, parametrisation, Gaussian spectral model
Field: Parametrization, Local optimum, Algorithm, Theoretical computer science, Speech recognition, Spectral line, Smoothing, Gaussian, Parametric statistics, Mixture model, Vocal tract, Mathematics
DocType: Journal
Volume: 45
Issue: 1-2
ISSN: 0922-5773
Citations: 4
PageRank: 0.55
References: 3
Authors: 6
Name               Order  Citations  PageRank
Parham Zolfaghari  1      57         11.43
Hiroko Kato        2      4          0.55
Yasuhiro Minami    3      4          0.55
Atsushi Nakamura   4      4          0.55
Shigeru Katagiri   5      850        114.01
Roy D. Patterson   6      224        34.70