Abstract | ||
---|---|---|
This paper proposes a novel parametric prosody coding approach for Mandarin speech. The encoder employs a hierarchical prosodic model (HPM) as a prosody-generating model to analyze the prosody of the input utterance and obtain a parametric representation of four prosodic-acoustic features for encoding: syllable pitch contour, syllable duration, syllable energy level, and syllable-juncture pause duration. In the decoder, the four prosodic-acoustic features are reconstructed by a synthesis operation using the decoded HPM parameters. The reconstructed prosodic features are then used in an HMM-based speech synthesizer to help generate the reconstructed speech. Experimental results show that the reconstructed speech has good quality at a low data rate of 114.9 bits/s for a speaker-dependent task. An informal listening test confirmed that the decoded speech sounded very fluent. |
Year | DOI | Venue |
---|---|---|
2013 | 10.1109/IIH-MSP.2013.24 | IIH-MSP |
Keywords | Field | DocType |
new model-based prosody coder,decoded speech,hmm-based speech synthesizer,novel parametric prosody,mandarin speech,speech prosody,prosodic-acoustic feature,syllable duration,reconstructed prosodic feature,prosody generating model,reconstructed speech | Prosody,Pitch contour,Speech synthesis,Computer science,Utterance,Coding (social sciences),Speech recognition,Encoder,Syllable,Hidden Markov model | Conference |
Citations | PageRank | References |
0 | 0.34 | 4 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chen-Yu Chiang | 1 | 31 | 11.55 |
Yu-Ping Hung | 2 | 0 | 0.68 |
Sin-Horng Chen | 3 | 273 | 39.86 |
Yih-Ru Wang | 4 | 237 | 34.68 |