Paper Info

Title
Mandarin Electrolaryngeal Speech Voice Conversion with Sequence-to-Sequence Modeling

Abstract
Electrolaryngeal speech (EL speech) is typically produced with an electrolarynx device that generates excitation signals to substitute for human vocal fold vibrations. Because the excitation signals cannot perfectly characterize the sound sources generated by vocal folds, the naturalness and intelligibility of EL speech are inevitably worse than those of natural speech (NL speech). To improve speech naturalness, statistical models, such as Gaussian mixture models and deep-learning-based models, have been employed for EL speech voice conversion (ELVC), which aims to convert EL speech into NL speech. For a frame-wise ELVC system, accurate feature alignment is crucial for model training. However, the abnormal acoustic characteristics of EL speech cause misalignments and accordingly limit ELVC performance. To address this issue, we propose a novel ELVC system based on sequence-to-sequence (seq2seq) modeling with text-to-speech (TTS) pretraining. The seq2seq model uses an attention mechanism to perform representation learning and alignment concurrently. Meanwhile, TTS pretraining enables efficient training with limited data. Experimental results show that the proposed ELVC system yields notable improvements over a well-known frame-wise ELVC system in terms of standardized evaluation metrics and subjective listening tests.
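
The abstract's point that attention learns the alignment, rather than relying on a separate frame-wise alignment step, can be illustrated with a toy sketch. This is not the authors' model: it is plain scaled dot-product attention in pure Python, and the names `softmax`, `attend`, `source_frames`, and `decoder_queries`, along with the toy vectors, are assumptions made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors.

    Returns one row of attention weights and one context vector per query.
    Each weight row is a soft alignment of that query onto the source frames.
    """
    dim = len(keys[0])
    weights, contexts = [], []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        row = softmax(scores)
        weights.append(row)
        contexts.append(
            [sum(w * v[d] for w, v in zip(row, values)) for d in range(len(values[0]))]
        )
    return weights, contexts


# Hypothetical toy data: three source (EL) frames and two decoder steps.
source_frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_queries = [[4.0, 0.0], [0.0, 4.0]]

weights, contexts = attend(decoder_queries, source_frames, source_frames)

# The most-attended source frame for each decoder step acts as a hard
# alignment read off the soft one.
aligned = [max(range(len(row)), key=row.__getitem__) for row in weights]
```

In this sketch each decoder step distributes its attention over all source frames, so no precomputed frame-to-frame alignment is required; a trained seq2seq model would learn the queries, keys, and values instead of using fixed vectors.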

Year	2021
DOI	10.1109/ASRU51503.2021.9687908
Venue	2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Keywords	electrolaryngeal speech, voice conversion, sequence-to-sequence learning, transformer, pretraining
DocType	Conference
ISBN	978-1-6654-3740-0
Citations	0
PageRank	0.34
References	0
Authors	9

Authors (9 rows)

Name	Order	Citations	PageRank
Ming-Chi Yen	1	0	0.34
Wen-Chin Huang	2	0	0.34
Kazuhiro Kobayashi	3	0	0.34
Yu-Huai Peng	4	8	4.23
Shu-Wei Tsai	5	0	0.34
Yu Tsao	6	0	0.68
Tomoki Toda	7	1874	167.18
Jyh-Shing Roger Jang	8	525	56.34
Hsin-Min Wang	9	0	1.35
