Title
A comparison of methods for speaker-dependent pronunciation tuning for text-to-speech synthesis
Abstract
Unit-based text-to-speech (TTS) systems typically use a set of speech recordings that have been phonetically transcribed to create a large set of phonetic units. During synthesis, pronunciations for input text are generated and used to guide the selection of a sequence of phonetic units. The style of these system pronunciations must match the style of the phonetic transcriptions of the recorded speech database in order to maximize the quality of the synthesized speech. Furthermore, since different speakers have different speech characteristics, supporting multiple speakers for a single language generally requires applying a speaker-dependent mapping to speaker-independent pronunciations. This paper investigates three automatic methods for this process of speaker-dependent pronunciation tuning: word N-grams, decision trees, and transformation-based learning. Transformation-based learning achieved the best results, lowering the phone error rate of the text pronunciations compared to the speech transcriptions by 26% over the error rate of the unmodified text transcriptions.
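The abstract's evaluation metric, phone error rate, can be illustrated with a minimal sketch. It assumes the common definition (Levenshtein edit distance between the text-derived pronunciation and the speech transcription, divided by the number of phones in the transcription) and reads the reported 26% as a relative reduction; neither detail is confirmed by this record. The function names and the phone symbols are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Phones = Sequence[str]


def edit_distance(ref: Phones, hyp: Phones) -> int:
    """Levenshtein distance between two phone sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1      # phone in ref missing from hyp
            insertion = curr[j - 1] + 1  # extra phone in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phone_error_rate(pairs: List[Tuple[Phones, Phones]]) -> float:
    """Total edit errors divided by total reference phones.

    Each pair is (speech transcription, text pronunciation); treating the
    speech transcription as the reference is an assumption of this sketch.
    """
    total_ref = sum(len(ref) for ref, _ in pairs)
    if total_ref == 0:
        raise ValueError("reference transcriptions contain no phones")
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    return errors / total_ref


def relative_reduction(baseline: float, tuned: float) -> float:
    """Fraction by which the tuned error rate improves on the baseline."""
    return (baseline - tuned) / baseline


if __name__ == "__main__":
    # Toy data with made-up phone symbols, for illustration only.
    transcription = ["b", "ah", "t", "er"]
    untuned = ["b", "ah", "t", "ax", "r"]
    print(phone_error_rate([(transcription, untuned)]))
```

Under these assumptions, a speaker-dependent tuning method would be scored by computing `phone_error_rate` once with the unmodified pronunciations and once with the tuned ones, then passing both values to `relative_reduction`.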
Year
2005
Venue
INTERSPEECH
Keywords
error rate, decision tree, text to speech
Field
Pronunciation, Speech corpus, Transcription (linguistics), Decision tree, Speech synthesis, Computer science, Word error rate, Speech recognition, Phone, Natural language processing, Text to speech synthesis, Artificial intelligence
DocType
Conference
Citations
0
PageRank
0.34
References
7
Authors
4
Name, Order, Citations, PageRank
Gabriel Webster, 1, 5, 1.30
Tina Burrows, 2, 5, 1.32
Kate Knill, 3, 249, 28.02
guildhall st, 4, 0, 0.34