Title
Improving the computational performance of standard GMM-based voice conversion systems used in real-time applications
Abstract
Voice conversion (VC) can be described as finding a mapping function that transforms the features extracted from a source speaker into those of a target speaker. Gaussian mixture model (GMM) based conversion is the most commonly used technique in VC, but it is prone to overfitting and oversmoothing. To address these issues, we propose a secondary classification: K-means is applied within each class obtained by a primary classification, yielding more precise local conversion functions. This proposal avoids the need for complex training algorithms because the local mapping functions are determined at the same time. The proposed approach consists of a Fourier cepstral analysis followed by a training phase that finds the local mapping functions, which transform the vocal tract characteristics of the source speaker into those of the target speaker. In the synthesis step, the converted parameters are combined with the excitation and phase extracted from the target training space by frame index selection to generate converted speech with target speech characteristics. Objective and subjective experiments show that the proposed technique outperforms the baseline GMM approach while greatly reducing training and transformation computation times.
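The two-level classification described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not confirm: the primary classification is taken to be K-means as well, the source and target frames are assumed to be time-aligned already, and each local mapping function is reduced to a simple mean offset between the target and source frames of a sub-class. The names `kmeans`, `train`, and `convert` are hypothetical.

```python
import math

def nearest(x, centroids):
    """Index of the centroid closest to x (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(x, centroids[i]))

def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm with deterministic, evenly spaced initialisation."""
    k = min(k, len(points))
    step = len(points) // k
    centroids = [list(points[i * step]) for i in range(k)]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = mean(members)
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels

def train(src, tgt, k1=2, k2=2):
    """Build local mappings; src[i] and tgt[i] are assumed to be aligned frames."""
    model = []
    primary_centroids, primary_labels = kmeans(src, k1)
    for a, centroid in enumerate(primary_centroids):
        idx = [i for i, l in enumerate(primary_labels) if l == a]
        if not idx:
            continue  # skip empty primary classes
        # Secondary K-means inside this primary class.
        sub_centroids, sub_labels = kmeans([src[i] for i in idx], k2)
        subs = []
        for b, sub_centroid in enumerate(sub_centroids):
            members = [i for i, l in zip(idx, sub_labels) if l == b]
            if members:
                mu_s = mean([src[i] for i in members])
                mu_t = mean([tgt[i] for i in members])
                # Assumed local mapping: a constant offset per sub-class.
                subs.append((sub_centroid, [t - s for s, t in zip(mu_s, mu_t)]))
        model.append((centroid, subs))
    return model

def convert(x, model):
    """Pick the nearest primary class, then sub-class, and apply its offset."""
    _, subs = model[nearest(x, [c for c, _ in model])]
    _, offset = subs[nearest(x, [c for c, _ in subs])]
    return [xi + oi for xi, oi in zip(x, offset)]

# Toy data: two well-separated source groups, each shifted differently in the target.
group_a = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]
group_b = [[10.0, 10.0], [10.1, 10.0], [10.0, 10.1], [10.1, 10.1]]
src = group_a + group_b
tgt = [[x + 1.0, y + 2.0] for x, y in group_a] + [[x - 3.0, y + 0.5] for x, y in group_b]
model = train(src, tgt, k1=2, k2=2)
converted = convert([0.05, 0.02], model)
```

Because every local mapping is computed in a single pass over the frames of its sub-class, no iterative joint optimisation is needed, which is consistent with the abstract's remark that the local mapping functions are determined at the same time as the classification. A real system would operate on cepstral feature vectors and would likely use a richer local transform than a constant offset.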
Year
2018
DOI
10.1109/ICECOCS.2018.8610514
Venue
2018 International Conference on Electronics, Control, Optimization and Computer Science (ICECOCS)
Keywords
Voice conversion, Gaussian Mixture Models, classification, local mapping functions
DocType
Conference
ISBN
978-1-5386-7868-8
Citations
1
PageRank
0.35
References
0
Authors
3
Name, Order, Citations, PageRank
Imen Ben Othmane, 1, 1, 0.69
Joseph Di Martino, 2, 1, 1.03
Kais Ouni, 3, 9, 4.40