Abstract | | |
---|---|---|
This paper proposes a hierarchical prosody modeling (HPM) approach for English speech, extending the HPM approach previously proposed for Mandarin speech. The approach first designs a syllable-based statistical prosodic model that describes the relationships among the prosodic-acoustic features of the speech signal, the linguistic features of the associated text, and the prosodic tags representing the underlying prosodic structure of the speech. It then employs a prosody labeling and modeling algorithm to estimate the model parameters and label the prosodic tags of all training utterances simultaneously from a prosody-unlabeled speech corpus. Experimental results on a corpus containing many paragraph-length utterances from a female Chinese speaker who majored in English show that the inferred model parameters are all meaningful. The trained model is then used to generate prosodic information for a TTS system. An informal listening test shows that the synthetic speech sounds quite natural. |
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/ICSDA.2014.7051427 | O-COCOSDA |
Keywords | DocType | Citations |
linguistics, natural language processing, speech synthesis, statistical analysis, text analysis, English speech, English-majored Chinese speaker, HPM approach, Mandarin speech, TTS, associated text, hierarchical prosody modeling, linguistic features, prosodic-acoustic features, prosody-unlabeled speech corpus, speech signal, statistical prosodic model, syllable-based prosodic model, text-to-speech, hierarchical prosodic model, prosody modeling, text to speech | Conference | 0 |
PageRank | References | Authors |
0.34 | 0 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Chungyao Tsai | 1 | 0 | 0.34 |
Chin-Kuan Kuo | 2 | 0 | 0.68 |
Yih-Ru Wang | 3 | 237 | 34.68 |
Sin-Horng Chen | 4 | 273 | 39.86 |
Ibin Liao | 5 | 0 | 0.68 |
Chen-Yu Chiang | 6 | 31 | 11.55 |