Title
Cross-Lingual Voice Conversion With Bilingual Phonetic Posteriorgram And Average Modeling
Abstract
This paper presents a cross-lingual voice conversion approach using bilingual Phonetic PosteriorGrams (PPGs) and average modeling. The proposed approach uses bilingual PPGs to represent speaker-independent features of speech signals from different languages in the same feature space. In particular, a bilingual PPG is formed by stacking two monolingual PPG vectors, which are extracted from two monolingual speech recognition systems. The conversion model is trained to learn the relationship between bilingual PPGs and the corresponding acoustic features. To leverage the linguistic and acoustic information from other speakers in different languages, an average model is trained with multiple speakers in both the source and target languages. An i-vector is used as an additional input feature of the average model for network adaptation. Experiments are performed for intralingual and cross-lingual voice conversion between English and Mandarin speakers. Both objective and subjective evaluations demonstrate the effectiveness of the proposed approach.
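The stacking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy dimensions, the order in which the two languages are stacked, and the choice to append the i-vector to every frame are all assumptions made for illustration only.

```python
from typing import List, Sequence

Frame = List[float]


def stack_bilingual_ppg(
    ppg_lang_a: Sequence[Sequence[float]],
    ppg_lang_b: Sequence[Sequence[float]],
) -> List[Frame]:
    """Concatenate two monolingual PPG sequences frame by frame.

    Hypothetical helper: each input holds one posterior vector per frame,
    produced by a separate monolingual recognizer for the same utterance.
    """
    if len(ppg_lang_a) != len(ppg_lang_b):
        raise ValueError("both PPG sequences must cover the same frames")
    return [list(a) + list(b) for a, b in zip(ppg_lang_a, ppg_lang_b)]


def append_ivector(
    frames: Sequence[Sequence[float]],
    ivector: Sequence[float],
) -> List[Frame]:
    """Append the same speaker i-vector to every frame (an assumption here)."""
    return [list(frame) + list(ivector) for frame in frames]


# Toy data: 2 frames, 3 English classes, 2 Mandarin classes, 2-dim i-vector.
# Real PPGs and i-vectors are far larger; these sizes are illustrative only.
english_ppg = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
mandarin_ppg = [[0.6, 0.4], [0.3, 0.7]]
speaker_ivector = [0.5, -0.5]

bilingual_ppg = stack_bilingual_ppg(english_ppg, mandarin_ppg)
model_input = append_ivector(bilingual_ppg, speaker_ivector)
```

Under these assumptions, each frame of `model_input` carries the posteriors from both recognizers followed by the speaker i-vector, so speech in either language is described in one shared feature space before it reaches the conversion model.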
Year: 2019
DOI: 10.1109/icassp.2019.8683746
Venue: 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Keywords: cross-lingual, voice conversion, Phonetic PosteriorGram (PPG), average modeling approach (AMA)
Field: Cross lingual, Feature vector, Pattern recognition, Computer science, Speech recognition, Artificial intelligence, Mandarin Chinese, Network adaptation
DocType: Conference
ISSN: 1520-6149
Citations: 3
PageRank: 0.36
References: 0
Authors: 5
Name                Order   Citations   PageRank
Yi Zhou             1       230         32.97
Xiaohai Tian        2       64          11.83
Haihua Xu           3       55          11.41
Das, Rohan Kumar    4       103         21.58
Haizhou Li          5       3678        334.61