Title
Inversion of F0 model for natural-sounding speech synthesis.
Abstract
Natural-sounding speech synthesizers require a model that describes prosody quantitatively. H. Fujisaki's model (see "Dynamic Characteristics of Voice Fundamental Frequency in Speech and Singing", The Production of Speech, Springer-Verlag, p. 39-47, 1983) has shown considerable accuracy for many languages (Fujisaki et al., IEEE Int. Conf. on Acoustics, Speech and Sig. Processing, vol. 2, p. 211-14, 1993; Fujisaki and Ohno, Fourth Int. Conf. on Sig. Processing, vol. 1, p. 714-17, 1998). We propose a method for estimating the parameters of Fujisaki's model, i.e., an inversion method, based on the relative extrema of the pitch contour followed by a gradient-algorithm refinement procedure. Preliminary results show that the proposed method matches pitch contours closely, and preliminary synthesis results using the extracted features are very encouraging.
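The abstract does not reproduce the model equations, so the following is only a minimal sketch of the commonly cited form of Fujisaki's model, in which ln F0(t) is a baseline ln Fb plus amplitude-weighted phrase and accent responses. The constants (alpha = 3.0, beta = 20.0, gamma = 0.9), all function names, and the synthetic data are assumptions made for illustration. The refinement step shown is plain gradient descent over command amplitudes with timings held fixed; it is not the authors' procedure, and the extremum-based initialization described in the abstract is not reproduced here.

```python
import math

# Commonly quoted default constants for the Fujisaki model (assumed here,
# not taken from the paper): alpha and beta in 1/s, gamma dimensionless.
ALPHA, BETA, GAMMA = 3.0, 20.0, 0.9

def phrase_response(t, alpha=ALPHA):
    """Gp(t): impulse response of the phrase control mechanism."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0.0 else 0.0

def accent_response(t, beta=BETA, gamma=GAMMA):
    """Ga(t): step response of the accent control mechanism, capped at gamma."""
    if t < 0.0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def basis(times, phrase_onsets, accent_spans):
    """One column per command, evaluated at every time instant."""
    cols = [[phrase_response(t - t0) for t in times] for t0 in phrase_onsets]
    cols += [[accent_response(t - t1) - accent_response(t - t2) for t in times]
             for t1, t2 in accent_spans]
    return cols

def log_f0(cols, amps, fb):
    """ln F0(t) = ln Fb + sum of amplitude-weighted command responses."""
    n = len(cols[0])
    return [math.log(fb) + sum(a * c[i] for a, c in zip(amps, cols))
            for i in range(n)]

def mse(cols, amps, fb, target):
    """Mean squared error between modelled and target log-F0 contours."""
    pred = log_f0(cols, amps, fb)
    return sum((p - y) ** 2 for p, y in zip(pred, target)) / len(target)

def refine_amplitudes(cols, amps, fb, target, lr=0.5, steps=2000):
    """Plain gradient descent on the mean squared log-F0 error.
    Only amplitudes are updated; timings and Fb are held fixed."""
    amps = list(amps)
    n = len(target)
    for _ in range(steps):
        err = [p - y for p, y in zip(log_f0(cols, amps, fb), target)]
        grads = [2.0 / n * sum(e * v for e, v in zip(err, c)) for c in cols]
        amps = [a - lr * g for a, g in zip(amps, grads)]
    return amps

# Synthetic demo (hypothetical values): build a contour from known
# amplitudes, then try to recover them from a deliberately wrong start.
times = [0.02 * i for i in range(101)]            # 0 to 2 s, 20 ms frames
cols = basis(times, phrase_onsets=[0.0], accent_spans=[(0.3, 0.6), (1.0, 1.5)])
true_amps = [0.5, 0.4, 0.3]                       # [Ap, Aa1, Aa2]
target = log_f0(cols, true_amps, fb=100.0)
initial = [0.3, 0.2, 0.5]
refined = refine_amplitudes(cols, initial, 100.0, target)
```

With timings fixed, the log-F0 contour is linear in the amplitudes, so the error surface is convex and a simple gradient step is enough for this toy case. Estimating onset times as well makes the problem non-convex, which is presumably why a good initial guess from the contour matters.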
Year: 2003
DOI: 10.1109/ICASSP.2003.1198832
Venue: ICASSP (1)
Keywords: feature extraction, gradient methods, natural languages, parameter estimation, speech synthesis, F0 model inversion, Italian continuous speech, fundamental frequency, gradient algorithm refinement procedure, inversion methods, model feature extraction, natural-sounding speech synthesis, pitch contour, prosody
DocType: Conference
Volume: 1
ISSN: 1520-6149
ISBN: 0-7803-7663-3
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name | Order | Citations | PageRank
Pierluigi Salvo Rossi | 1 | 328 | 27.27
Francesco Palmieri | 2 | 0 | 0.68
Francesco Cutugno | 3 | 76 | 18.01