Abstract |
---|
This paper studies methods for emotional speech synthesis using a neural vocoder. WaveNet is used as the neural vocoder, generating waveforms from mel spectrograms. We propose two networks, i.e., a deep convolutional neural network (CNN)-based text-to-speech (TTS) system and an emotional converter; the deep CNN architecture is designed to utilize long-term context information. The first network estimates neutral mel spectrograms from linguistic features, and the second converts neutral mel spectrograms to emotional mel spectrograms. Experimental results on a TTS system and an emotional TTS system show that the proposed systems are a promising approach. |
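The two-stage pipeline described in the abstract (linguistic features → neutral mel spectrogram → emotional mel spectrogram, followed by WaveNet vocoding) can be sketched as follows. This is a hypothetical dataflow illustration, not the paper's implementation: the feature dimensionalities, the use of random linear maps as stand-ins for the trained deep CNNs, and the function names are all assumptions.

```python
import numpy as np

MEL_BINS = 80    # assumed mel-spectrogram dimensionality
LING_DIM = 300   # assumed linguistic-feature dimensionality

rng = np.random.default_rng(0)

def tts_network(linguistic_feats: np.ndarray) -> np.ndarray:
    """Stage 1 (stand-in): the deep CNN-based TTS network that maps
    linguistic features to a neutral mel spectrogram. A random linear
    projection stands in for the trained network."""
    W = rng.standard_normal((LING_DIM, MEL_BINS))
    return linguistic_feats @ W  # shape: (frames, MEL_BINS)

def emotion_converter(neutral_mel: np.ndarray) -> np.ndarray:
    """Stage 2 (stand-in): the emotional converter that maps neutral
    mel frames to emotional mel frames; here a frame-wise linear map."""
    W = rng.standard_normal((MEL_BINS, MEL_BINS))
    return neutral_mel @ W  # shape: (frames, MEL_BINS)

frames = 120
ling = rng.standard_normal((frames, LING_DIM))  # dummy linguistic features
neutral = tts_network(ling)
emotional = emotion_converter(neutral)
# In the paper, the WaveNet vocoder would then synthesize the waveform
# from the emotional mel spectrogram; that step is omitted here.
print(neutral.shape, emotional.shape)
```

The point of the sketch is the dataflow: emotion is imposed by transforming the intermediate mel-spectrogram representation, so the same vocoder can render both neutral and emotional speech.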
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/ICCE.2019.8661919 | 2019 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE) |
Field | DocType | ISSN
---|---|---|
Computer vision, Speech synthesis, Computer science, Convolutional neural network, Spectrogram, Speech recognition, Artificial intelligence | Conference | 2158-3994
Citations | PageRank | References
---|---|---|
0 | 0.34 | 0
Authors |
---|
4
Name | Order | Citations | PageRank |
---|---|---|---|
Heejin Choi | 1 | 6 | 1.80 |
Sangjun Park | 2 | 2 | 2.43
Jinuk Park | 3 | 2 | 2.74 |
Minsoo Hahn | 4 | 223 | 46.63 |