Abstract |
---|
One problem in concatenative speech synthesis is how to incorporate prosodic factors into unit selection. Imposing a predicted prosodic target is error-prone and does not benefit from the prosodic variability of the database. In this paper, we assume that several prosodic contours exist in the database for the same symbolic entry. This variability is represented by probabilistic models of the prosodic contours, and the optimal sequence of units is found by maximizing a joint likelihood at both the segmental and prosodic levels. A generalized Viterbi algorithm is used to take into account the long-term dependencies introduced by the prosodic models. The method has been implemented in a unit-selection synthesizer using an expressive speech database, and a subjective experiment shows an improvement in speech naturalness over a conventional unit-selection method. |
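
The search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes one common form of generalized (list) Viterbi, in which up to `k` partial paths survive per candidate unit and a path-level prosodic score is added only at the end. The names `list_viterbi`, `target_ll`, `join_ll`, and `prosody_ll`, and all toy scores, are hypothetical.

```python
import heapq
from typing import Callable, Sequence, Tuple

Path = Tuple[int, ...]


def list_viterbi(
    candidates: Sequence[Sequence[int]],
    target_ll: Callable[[int, int], float],
    join_ll: Callable[[int, int], float],
    prosody_ll: Callable[[Path], float],
    k: int = 2,
) -> Tuple[float, Path]:
    """Return (joint log-likelihood, unit sequence) of the best path found.

    Assumption: prosody_ll scores a whole path, so it cannot be folded into
    a first-order recursion; keeping k survivors per unit is the workaround.
    """
    # survivors[u]: up to k best (segmental score, path) pairs ending in unit u
    survivors = {u: [(target_ll(0, u), (u,))] for u in candidates[0]}
    for t in range(1, len(candidates)):
        extended_by_unit = {}
        for u in candidates[t]:
            extended = [
                (score + join_ll(path[-1], u) + target_ll(t, u), path + (u,))
                for paths in survivors.values()
                for score, path in paths
            ]
            extended_by_unit[u] = heapq.nlargest(k, extended)
        survivors = extended_by_unit
    # Rescore every surviving path with the long-term prosodic model.
    return max(
        (score + prosody_ll(path), path)
        for paths in survivors.values()
        for score, path in paths
    )


# Toy data (hypothetical): two positions, two candidate units each.
toy_candidates = [[0, 1], [2, 3]]
toy_join = {(0, 2): -1.0, (1, 2): -2.0, (0, 3): -3.0, (1, 3): -4.0}


def toy_target_ll(t: int, u: int) -> float:
    return 0.0


def toy_join_ll(prev: int, cur: int) -> float:
    return toy_join[(prev, cur)]


def toy_prosody_ll(path: Path) -> float:
    # The prosodic model prefers a path that is not the segmental best.
    return 0.0 if path == (1, 2) else -5.0


best_k1 = list_viterbi(toy_candidates, toy_target_ll, toy_join_ll, toy_prosody_ll, k=1)
best_k2 = list_viterbi(toy_candidates, toy_target_ll, toy_join_ll, toy_prosody_ll, k=2)
```

In this toy case, `k=1` reduces to a conventional Viterbi search and discards the path `(1, 2)` before the prosodic score is applied, whereas `k=2` keeps it and recovers the path with the highest joint score.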
Year | DOI | Venue |
---|---|---|
2011 | 10.1109/ICASSP.2011.5947569 | Acoustics, Speech and Signal Processing |
Keywords | Field | DocType |
maximum likelihood estimation,speech synthesis,generalized Viterbi algorithm,probabilistic models,prosodic control,unit-selection speech synthesis,prosody,unit selection | Prosody,Speech synthesis,Pattern recognition,Computer science,Naturalness,Maximum likelihood,Context model,Speech recognition,Artificial intelligence,Natural language processing,Probabilistic logic,Viterbi algorithm | Conference |
ISSN | ISBN | Citations |
1520-6149 | 978-1-4577-0537-3 | 0 |
PageRank | References | Authors |
0.34 | 7 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Christophe Veaux | 1 | 1531 | 390.95 |
Xavier Rodet | 2 | 627 | 107.87 |