Title
Emotional Voice Conversion Using Dual Supervised Adversarial Networks With Continuous Wavelet Transform F0 Features
Abstract
In emotional voice conversion (VC), a simple representation of the fundamental frequency (F0), the most important feature in emotional voice representation, is difficult to work with. To address this issue, we propose the adaptive scales continuous wavelet transform (ADS-CWT) method, which systematically captures F0 features at different temporal levels; these levels represent different prosodic aspects, ranging from micro-prosody to the sentence level. Moreover, in an emotional VC task, each dataset pairs labeled emotional voice with neutral voice, so the conversion can be regarded as a dual task. Two properties motivate our training framework. First, dual supervised learning improves training performance by leveraging the probabilistic connection between the dual tasks to enhance learning from labeled data. Second, generative adversarial networks (GANs) mitigate the over-smoothing problem that arises in the low-level data space when converting acoustic features. We therefore present a novel training framework for emotional VC that combines GANs with dual supervised learning, named dual supervised adversarial networks. In emotional VC experiments, we confirmed the high similarity performance of our method when using limited labeled data. Our method achieves good and consistent performance in both objective and subjective evaluations.
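The multi-scale view of F0 described in the abstract can be illustrated with a plain continuous wavelet transform. The following is a minimal sketch and not the paper's ADS-CWT: the Mexican hat mother wavelet, the fixed dyadic scales, the 5 ms frame shift, and the names `mexican_hat` and `cwt_f0` are all assumptions made for illustration, and the adaptive scale selection is not reproduced.

```python
import numpy as np

def mexican_hat(t):
    """Mexican hat (Ricker) mother wavelet, unnormalised (assumed choice)."""
    return (1.0 - t**2) * np.exp(-0.5 * t**2)

def cwt_f0(log_f0, scales, frame_shift=0.005):
    """Decompose a log-F0 contour into one row per temporal scale.

    log_f0      : 1-D array, one interpolated log-F0 value per frame
    scales      : iterable of scales in seconds (hypothetical, fixed here)
    frame_shift : frame period in seconds (assumed to be 5 ms)
    Returns an array of shape (len(scales), len(log_f0)).
    """
    # Zero-mean, unit-variance normalisation of the contour.
    x = (log_f0 - log_f0.mean()) / (log_f0.std() + 1e-8)
    n = len(x)
    rows = []
    for s in scales:
        # Sample the wavelet over +/- 5 scale units around its centre.
        half = int(np.ceil(5.0 * s / frame_shift))
        t = np.arange(-half, half + 1) * frame_shift / s
        kernel = mexican_hat(t) / np.sqrt(s)
        # Full convolution, then keep the part aligned with the input frames.
        full = np.convolve(x, kernel, mode="full") * frame_shift
        rows.append(full[half:half + n])
    return np.stack(rows)

# Ten scales one octave apart, from 20 ms up to about 10 s (illustrative only).
scales = 0.01 * 2.0 ** np.arange(1, 11)
```

Under these assumptions, small scales respond to fast, micro-prosodic F0 movement, while large scales respond to slow, phrase- or sentence-level movement. Calling `cwt_f0(log_f0, scales)` on a 400-frame contour returns a 10 x 400 array, one row per scale.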
Year
2019
DOI
10.1109/TASLP.2019.2923951
Venue
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Keywords
Continuous wavelet transforms, Task analysis, Supervised learning, Hidden Markov models, Training
Field
Data space, Fundamental frequency, Pattern recognition, Computer science, Continuous wavelet transform, Supervised learning, Speech recognition, Ranging, Artificial intelligence, Generative grammar, Probabilistic logic, Adversarial system
DocType
Journal
Volume
27
Issue
10
ISSN
2329-9290
Citations
0
PageRank
0.34
References
14
Authors
4
Name                Order  Citations  PageRank
Zhaojie Luo         1      6          2.18
J. Chen             2      112        23.18
Tetsuya Takiguchi   3      308        52.22
Yasuo Ariki         4      519        88.94