Abstract |
---|
In most state-of-the-art voice conversion systems, the speech quality of converted utterances is still unsatisfactory. In this paper, the STRAIGHT analysis-synthesis framework is used to improve the quality. A smoothed GMM and MAP adaptation are proposed for spectrum conversion to avoid the over-smoothing seen in the traditional GMM method. Since frames are processed independently, the GMM-based transformation function may generate discontinuous features. Therefore, a time-domain low-pass filter is applied to the transformation function during the conversion phase. The results of listening evaluations show that the quality of the speech converted by the proposed method is significantly better than that of the traditional GMM method. Meanwhile, speaker identifiability of the converted voice reaches 75%, even when the difference between the source speaker and the target speaker is not very large. |
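
The smoothing step described in the abstract can be illustrated with a minimal sketch. It assumes that the low-pass filter is a simple moving average applied to the per-frame GMM posterior weights, and that each component contributes a linear transform; the abstract does not specify the filter, so the function names, the filter width, and the diagonal-covariance model below are all hypothetical. STRAIGHT analysis and MAP adaptation are omitted.

```python
import numpy as np


def gmm_posteriors(x, priors, means, variances):
    """Per-frame component posteriors for a diagonal-covariance GMM.

    x: (T, D) source frames; priors: (M,); means, variances: (M, D).
    Returns a (T, M) array whose rows sum to 1.
    """
    diff = x[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * np.sum(diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2)
    log_w = np.log(priors) + log_pdf
    log_w -= log_w.max(axis=1, keepdims=True)  # stabilise the exponent
    w = np.exp(log_w)
    return w / w.sum(axis=1, keepdims=True)


def smooth_over_time(weights, width=5):
    """Low-pass filter each column of a (T, M) weight track along time.

    Assumption: a boxcar (moving-average) filter with odd width and
    edge padding; the original filter design is not given in the abstract.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd")
    kernel = np.ones(width) / width
    pad = width // 2
    padded = np.pad(weights, ((pad, pad), (0, 0)), mode="edge")
    smoothed = np.stack(
        [np.convolve(padded[:, m], kernel, mode="valid") for m in range(weights.shape[1])],
        axis=1,
    )
    # Renormalise so each frame's weights still sum to 1.
    return smoothed / smoothed.sum(axis=1, keepdims=True)


def convert_frames(x, weights, A, b):
    """Weighted sum of per-component linear transforms.

    A: (M, D, D) matrices, b: (M, D) offsets, weights: (T, M).
    Returns the converted frames with shape (T, D).
    """
    per_component = np.einsum("mij,tj->tmi", A, x) + b[None, :, :]
    return np.einsum("tm,tmi->ti", weights, per_component)
```

In this sketch, converting with `smooth_over_time(gmm_posteriors(...))` instead of the raw posteriors keeps the mixing weights from jumping between adjacent frames, which is one way to reduce the discontinuities that frame-independent processing can introduce.
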

Year | Venue | Keywords |
---|---|---|
2003 | INTERSPEECH | time domain, low pass filter, spectrum |

Field | DocType | Citations |
---|---|---|
MAP adaptation, Pattern recognition, Computer science, Speech recognition, Artificial intelligence | Conference | 45 |

PageRank | References | Authors |
---|---|---|
2.94 | 12 | 5 |

Name | Order | Citations | PageRank |
---|---|---|---|
Yining Chen | 1 | 92 | 8.76 |
Min Chu | 2 | 316 | 32.29 |
Eric Chang | 3 | 625 | 49.79 |
Jia Liu | 4 | 183 | 32.42 |
RunSheng Liu | 5 | 111 | 12.56 |