Title
A syllable based statistical text to speech system
Abstract
A statistical parametric speech synthesis system uses triphones, phones, or full-context phones to address the problem of co-articulation. In this paper, syllables are used as the basic units in the parametric synthesiser. Conventionally, full-context phones in a hidden Markov model (HMM) based speech synthesis framework are modeled with a fixed number of states, because each phoneme corresponds to a single indivisible sound. A syllable, on the other hand, is made up of a sequence of one or more sounds. To accommodate this variation, a variable number of states is used to model a syllable. Although a variable number of states is required to model syllables, a syllable captures co-articulation well since it is the smallest production unit. A syllable-based speech synthesis system therefore does not require a well-designed question set. The total number of syllables in a language is quite high, and not all of them can be modeled. To address this issue, a fallback unit is modeled instead. The quality of the proposed system is comparable to that of the phoneme-based system in terms of degradation mean opinion score (DMOS) and word error rate (WER).
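The two ideas in the abstract, a state count that varies with the syllable and a fallback for syllables without a model, can be illustrated with a minimal Python sketch. The abstract does not say how the number of states is chosen or what the fallback unit is, so everything here is an assumption made for illustration: the rule of five states per phone, the phone-level fallback, and the names `STATES_PER_PHONE`, `num_states`, and `select_units` are all hypothetical.

```python
# Illustrative sketch only: the per-phone state count and the phone-level
# fallback are assumptions, not details taken from the paper.
STATES_PER_PHONE = 5  # assumed value for illustration


def num_states(phones):
    """Number of HMM states for a unit made of the given phone sequence."""
    return STATES_PER_PHONE * len(phones)


def select_units(syllables, modeled):
    """Map each syllable to a modeled unit, falling back to its phones.

    syllables: list of tuples of phone labels, one tuple per syllable.
    modeled:   set of syllables (tuples of phones) that have trained models.
    Returns a list of (unit, n_states) pairs.
    """
    units = []
    for syl in syllables:
        if syl in modeled:
            # Syllable has its own model; state count grows with its length.
            units.append((syl, num_states(syl)))
        else:
            # Hypothetical fallback: treat each phone of the unseen
            # syllable as a separate unit.
            units.extend(((p,), num_states((p,))) for p in syl)
    return units


if __name__ == "__main__":
    modeled = {("k", "a"), ("m", "a", "l")}
    utterance = [("k", "a"), ("m", "a", "l"), ("s", "t", "r", "i")]
    for unit, n in select_units(utterance, modeled):
        print("-".join(unit), n)
```

Under these assumptions a longer syllable simply receives more states, and any syllable missing from the modeled inventory is decomposed into smaller units rather than dropped.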
Year
2013
Venue
Signal Processing Conference
Keywords
hidden Markov models, speech synthesis, DMOS, HMM based speech synthesis, WER, full context phones, hidden Markov model, parametric synthesiser, phoneme based system, single indivisible sound, statistical parametric speech synthesis system, syllable based speech synthesis system, syllable based statistical text to speech system, triphones, HTS, Statistical TTS, Syllable, TTS
Field
Speech synthesis, Computer science, Speech recognition, Parametric statistics, Syllable, Hidden Markov model
DocType
Conference
Citations
0
PageRank
0.34
References
0
Authors
5
Name                  Order  Citations  PageRank
Abhijit Pradhan       1      0          0.34
Aswin Shanmugam, S.   2      25         2.90
Anusha Prakash        3      4          2.61
Kamakoti Veezhinathan 4      35         4.04
Hema A. Murthy        5      603        79.54