Title
A study on adaptation of speaking rate-dependent hierarchical prosodic model for Chinese dialect TTS
Abstract
This paper presents a new approach to developing a speaking rate (SR)-dependent hierarchical prosodic model (SR-HPM) for use in an SR-controlled text-to-speech (TTS) system for Taiwanese (Min-Nan), a resource-limited Chinese dialect. The main challenge is that an SR-HPM is difficult to build directly from a Taiwanese database with sparse coverage of linguistic context, prosody, and SR. Exploiting the property that Taiwanese and Mandarin Chinese share the same linguistic characteristics, we propose an adaptation approach that constructs a Taiwanese SR-HPM from a small Taiwanese corpus of fast speech, with the help of an existing Mandarin SR-HPM well trained on a large Mandarin corpus whose utterances cover a wide range of SR. The proposed method has two parts: adaptation of the normalization functions (NFs) and an adaptive prosody labeling and modeling (PLM) algorithm. Both parts are formulated as maximum a posteriori (MAP) estimation, with the existing Mandarin SR-HPM serving as an informative prior. The effectiveness of the approach was evaluated in a prosody generation experiment for Taiwanese TTS, using a small corpus of fast speech with SR of 4.5-6.8 syllables/sec. Experimental results showed that the generated prosody sounded quite natural over a wider SR range of 3.4-6.8 syllables/sec.
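The abstract states that both parts of the method are formulated as MAP estimation with the Mandarin model as an informative prior, but gives no formulas. As a rough illustration of that general idea only, and not the paper's actual formulation, the sketch below shows the textbook MAP estimate of a Gaussian mean under a conjugate Gaussian prior. The function name `map_adapt_mean` and the parameter `prior_weight` are hypothetical, introduced here for illustration.

```python
def map_adapt_mean(prior_mean, samples, prior_weight):
    """Generic MAP estimate of a Gaussian mean with a conjugate Gaussian prior.

    prior_mean   : mean taken from the well-trained source model (assumption:
                   this plays the role of the Mandarin prior)
    samples      : observations from the small target corpus
    prior_weight : hypothetical pseudo-count expressing trust in the prior
    """
    n = len(samples)
    if n == 0:
        # With no target data the estimate falls back to the prior.
        return prior_mean
    sample_mean = sum(samples) / n
    # Weighted interpolation between the prior mean and the sample mean.
    return (prior_weight * prior_mean + n * sample_mean) / (prior_weight + n)
```

With few target samples the estimate stays close to the prior mean, and it moves toward the sample mean as more target data become available. That behavior is what makes an informative prior useful for a small corpus.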
Year
2015
DOI
10.1109/ICSDA.2015.7357862
Venue
2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE)
Keywords
Prosody, text-to-speech system, speaking rate, Mandarin, dialect, Taiwanese, Min-Nan
Field
Prosody, Normalization (statistics), Computer science, Speech recognition, Natural language processing, Artificial intelligence, Hidden Markov model, Mandarin Chinese
DocType
Conference
ISSN
2163-3479
Citations
0
PageRank
0.34
References
6
Authors
1
Name
Order
Citations
PageRank
Chen-Yu Chiang13111.55