Abstract | ||
---|---|---|
Despite the growing interest in expressive speech synthesis, the synthesis of nonverbal expressions remains under-explored. In this paper, we propose an audio laughter synthesis system based on a sequence-to-sequence TTS synthesis system. We leverage transfer learning by training a deep learning model to generate both speech and laughs from annotations. We evaluate our model with a listening test, comparing its performance to that of an HMM-based laughter synthesis system, and find that it reaches higher perceived naturalness. Our solution is a first step towards a TTS system able to synthesize speech with control over the amusement level and with laughter integration. |
Year | DOI | Venue |
---|---|---|
2020 | 10.21437/Interspeech.2020-1423 | INTERSPEECH |
DocType | Citations | PageRank |
---|---|---|
Conference | 0 | 0.34 |
References | Authors |
---|---|
0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Noé Tits | 1 | 0 | 1.35 |
Kevin El Haddad | 2 | 34 | 9.01 |
T. Dutoit | 3 | 313 | 30.47 |