Title
Interactive Intonation Optimisation Using CMA-ES and DCT Parameterisation of the F0 Contour for Speech Synthesis.
Abstract
Expressive speech is one of the latest concerns of text-to-speech systems. Because the realisation of expression and emotion in speech is subjective, humans cannot objectively determine whether one system is more expressive than another. Most text-to-speech systems have a rather flat intonation and do not provide the option of changing the output speech. We therefore present an interactive intonation optimisation method based on pitch contour parameterisation and evolution strategies. The Discrete Cosine Transform (DCT) is applied to the phrase-level pitch contour. The genome is then encoded as a vector containing the 7 most significant DCT coefficients. Starting from this initial individual, new speech samples are obtained using an interactive Covariance Matrix Adaptation Evolution Strategy (CMA-ES) algorithm. We evaluate a series of parameters involved in the process, such as the initial standard deviation, the population size, the dynamic expansion of the pitch over the generations, and the naturalness and expressivity of the resulting individuals. The results were obtained on a Romanian parametric speech synthesiser and provide guidelines for setting up an interactive optimisation system in which users can subjectively select the individual that best suits their expectations with a minimum amount of fatigue.
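The parameterisation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the F0 contour is synthetic, "most significant" is assumed here to mean the first 7 (lowest-order) DCT coefficients, and the helper names (`dct2`, `idct2`, `sample_offspring`) and the value of `sigma` are hypothetical. The sampling step draws from an isotropic Gaussian around the initial genome, which corresponds only to a first generation before any covariance adaptation; it is not a full CMA-ES.

```python
import math
import random


def dct2(x):
    """Orthonormal DCT-II of a sequence of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct2(coeffs, n):
    """Inverse of dct2 for a length-n signal; coefficients beyond len(coeffs) count as zero."""
    out = []
    for i in range(n):
        s = 0.0
        for k, ck in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * ck * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out


def sample_offspring(genome, sigma, population_size, rng):
    """Perturb each gene with isotropic Gaussian noise (assumption: identity covariance)."""
    return [
        [g + sigma * rng.gauss(0.0, 1.0) for g in genome]
        for _ in range(population_size)
    ]


# Synthetic, illustrative phrase-level F0 contour in Hz (not data from the paper).
f0 = [120.0 + 20.0 * math.sin(2.0 * math.pi * t / 50.0) - 0.3 * t for t in range(50)]

# Genome: assumed to be the first 7 DCT coefficients of the contour.
genome = dct2(f0)[:7]

# One generation of candidate genomes; sigma and population size are hypothetical.
rng = random.Random(0)
candidates = sample_offspring(genome, sigma=5.0, population_size=4, rng=rng)

# Each candidate maps back to a smooth F0 contour that a synthesiser could render.
contours = [idct2(c, len(f0)) for c in candidates]
```

In an interactive setting, the listener's choice among the rendered `contours` would stand in for the fitness function. A complete CMA-ES would additionally adapt the step size and covariance matrix from the selected individuals.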
Year
2011
DOI
10.1007/978-3-642-24094-2_4
Venue
Studies in Computational Intelligence
Field
Pitch contour, Speech synthesis, Computer science, Discrete cosine transform, Phrase, Speech recognition, Mean opinion score, Evolution strategy, Parametric statistics, CMA-ES
DocType
Conference
Volume
387
ISSN
1860-949X
Citations
0
PageRank
0.34
References
15
Authors
5
Name | Order | Citations | PageRank
Adriana Stan | 1 | 36 | 7.23
Florin-Claudiu Pop | 2 | 20 | 2.19
Marcel Cremene | 3 | 72 | 10.22
Mircea Giurgiu | 4 | 11 | 5.19
Denis Pallez | 5 | 65 | 6.79