Abstract |
---|
Speaker anonymization is a method of protecting voice privacy by concealing individual speaker characteristics while preserving linguistic information. The VoicePrivacy Challenge 2020 was initiated to generalize the task of speaker anonymization. The challenge introduced two frameworks for speaker anonymization; in this study, we propose a method of improving the primary framework by modifying the state-of-the-art speaker individuality feature (namely, the x-vector) in a neural waveform speech synthesis model. Our proposed method is based on x-vector singular value modification with a clustering model. We also propose a technique for modifying the fundamental frequency and speech duration to enhance the anonymization performance. To evaluate our method, we carried out objective and subjective tests. The overall objective test results show that our proposed method improves the anonymization performance in terms of speaker verifiability, whereas the subjective evaluation results show improvement in terms of speaker dissimilarity. With speech prosody modification, the intelligibility and naturalness of the anonymized speech were slightly reduced (a word error rate difference of less than 5%) compared with the results obtained by the baseline system. |
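
The abstract reports intelligibility in terms of word error rate (WER). As a rough illustration only, and not the evaluation pipeline used in the study (which this record does not describe), WER is commonly computed as the word-level edit distance between a reference transcript and a recognizer's output, divided by the number of reference words. The function name `word_error_rate` and the example transcripts below are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: a reference, plus recognizer output for
# baseline-anonymized and proposed-anonymized versions of the same utterance.
reference = "the cat sat on the mat"
wer_baseline = word_error_rate(reference, "the cat sat on the mat")
wer_proposed = word_error_rate(reference, "the cat sat on mat")
print(f"WER difference: {wer_proposed - wer_baseline:.3f}")
```

Under this definition, a small WER difference between the two systems would correspond to the kind of slight intelligibility loss the abstract describes; whether the reported 5% is absolute or relative is not stated in this record.
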
Year | DOI | Venue |
---|---|---|
2022 | 10.1016/j.csl.2021.101326 | COMPUTER SPEECH AND LANGUAGE |
Keywords | DocType | Volume |
---|---|---|
Speaker anonymization, X-vector singular value, Fundamental frequency, Clustering, Subjective evaluation | Journal | 73 |
ISSN | Citations | PageRank |
---|---|---|
0885-2308 | 0 | 0.34 |
References | Authors |
---|---|
0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Candy Olivia Mawalim | 1 | 0 | 2.03 |
Kasorn Galajit | 2 | 0 | 1.01 |
Jessada Karnjana | 3 | 2 | 3.42 |
Shunsuke Kidani | 4 | 0 | 0.68 |
Masashi Unoki | 5 | 0 | 0.34 |