Title
RECONSTRUCTING DUAL LEARNING FOR NEURAL VOICE CONVERSION USING RELATIVELY FEW SAMPLES
Abstract
This paper introduces DualVC, a dual learning system for neural voice conversion that exploits the symmetry of the speech conversion task to train on relatively few samples. The system contains a pair of sequence-to-sequence neural networks that share the same structure but are trained in opposite directions. The objective function for dual model training is the sum of the paired conversion loss and the reconstruction loss over the dual training cycle. The models in the two directions are trained alternately, each guiding the other through the corresponding reconstruction loss. Furthermore, curriculum learning techniques are used to load models from existing fields into the current task, accelerating iteration and convergence. In voice conversion experiments, the proposed DualVC with the curriculum learning strategy, trained on only 30% of the dataset, achieved naturalness and similarity comparable to the BaseVC model trained on the full dataset.
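The objective described in the abstract (conversion loss plus reconstruction loss, with the two directions trained alternately) can be illustrated with a toy sketch. This is not the authors' implementation: the two sequence-to-sequence networks are replaced by scalar multipliers, the losses are plain squared errors, and every name here (`dual_loss`, `grad_main`, `train`, the learning rate, the toy data) is a hypothetical choice made only for illustration.

```python
# Toy illustration of dual training; NOT the paper's DualVC model.
# Assumption: the forward "model" is f(x) = a * x (source -> target) and the
# backward "model" is g(y) = b * y (target -> source). Both are scalars.

def dual_loss(w_main, w_dual, src, tgt):
    """Paired conversion loss plus reconstruction loss for one direction."""
    conversion = (w_main * src - tgt) ** 2
    # Round trip: convert with the main model, convert back with the dual model.
    reconstruction = (w_dual * w_main * src - src) ** 2
    return conversion + reconstruction

def grad_main(w_main, w_dual, src, tgt):
    """Gradient of dual_loss with respect to w_main; w_dual is held fixed."""
    d_conversion = 2 * (w_main * src - tgt) * src
    d_reconstruction = 2 * (w_dual * w_main * src - src) * w_dual * src
    return d_conversion + d_reconstruction

def train(pairs, steps=400, lr=0.002):
    """Alternate between updating the forward and the backward model."""
    a, b = 0.0, 0.0
    for step in range(steps):
        for x, y in pairs:
            if step % 2 == 0:
                # Forward direction: update a, guided by reconstruction through b.
                a -= lr * grad_main(a, b, x, y)
            else:
                # Backward direction: update b, guided by reconstruction through a.
                b -= lr * grad_main(b, a, y, x)
    return a, b

# Hypothetical paired data in which the target is twice the source.
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
a, b = train(pairs)
print(a, b)
```

In this toy setting the forward weight should approach 2 and the backward weight should approach 0.5, so the round trip returns the input. In the paper, the same roles are played by two full sequence-to-sequence networks.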
Year
2021
DOI
10.1109/ASRU51503.2021.9687965
Venue
2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU)
Keywords
speech synthesis, voice conversion, dual learning, sequence-to-sequence, curriculum learning
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
8
Name                    Order   Citations   PageRank
Aolan Sun               1       0           0.34
Jianzong Wang           2       0           0.34
Ning Cheng              3       0           0.34
Methawee Tantrawenith   4       0           0.34
Zhiyong Wu              5       0           0.34
Helen Meng              6       0           0.34
Edward Xiao             7       0           1.01
Jing Xiao               8       0           0.34