Title
Current status of the IBM Trainable Speech Synthesis System.
Abstract
This paper describes the current status of the IBM Trainable Speech Synthesis System. The system is a state-of-the-art, trainable, unit-selection-based concatenative speech synthesiser. It uses hidden Markov models (HMMs) to provide a phonetic transcription and HMM state alignment of a database of single-speaker continuous-speech training data. The runtime synthesiser uses the resulting HMM-state-sized segments as its basic synthesis units. To produce a target sentence, it determines which segments to concatenate using decision trees built from the training data and a dynamic programming search that optimises a perceptually motivated cost function. The synthesiser can operate both in general-domain Text-to-Speech mode and in Phrase Splicing mode, which provides higher-quality synthesis in limited domains. Systems have been built in at least 10 different languages and over 70 voices.
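The dynamic programming search mentioned in the abstract can be illustrated with a minimal Viterbi-style sketch. This is not the IBM system's implementation: the function name `select_units`, the split into a `target_cost` and a `join_cost`, and the candidate lists are all assumptions made for illustration. In the system described, the candidates at each position would come from the decision trees, and the cost function is perceptually motivated; its exact form is not given here.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

U = TypeVar("U")  # a candidate segment (hypothetical representation)


def select_units(
    candidates: Sequence[Sequence[U]],
    target_cost: Callable[[int, U], float],
    join_cost: Callable[[U, U], float],
) -> Tuple[List[U], float]:
    """Pick one candidate per position so that the summed target and
    join costs are minimal (Viterbi-style dynamic programming)."""
    if not candidates or any(len(c) == 0 for c in candidates):
        raise ValueError("every position needs at least one candidate")

    # best[i][j]: cheapest cost of any path ending at candidate j of position i
    best = [[target_cost(0, u) for u in candidates[0]]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor index at i-1

    for i in range(1, len(candidates)):
        row, ptr = [], []
        for u in candidates[i]:
            costs = [
                best[i - 1][k] + join_cost(prev, u)
                for k, prev in enumerate(candidates[i - 1])
            ]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            row.append(costs[k_min] + target_cost(i, u))
            ptr.append(k_min)
        best.append(row)
        back.append(ptr)

    # Trace back from the cheapest final candidate.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    total = best[-1][j]
    path = [j]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)], total
```

As a toy usage example, candidates can be represented by a single pitch value each, with a zero target cost and a join cost equal to the pitch jump between neighbours; the search then returns the sequence with the smoothest pitch contour. A real cost function would combine several features.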
Year
2001
Venue
SSW
Field
Decision tree, Speech synthesis, IBM, Phonetic transcription, Computer science, Phrase, Speech recognition, Concatenation, Hidden Markov model, Sentence
DocType
Conference
Citations
11
PageRank
1.61
References
5
Authors
19
Name | Order | Citations and PageRank (digits fused in source)
Robert E. Donovan | 1 | 7917.28
Abraham Ittycheriah | 2 | 53461.23
Martin Franz | 3 | 48353.56
Bhuvana Ramabhadran | 4 | 1779153.83
Ellen Eide | 5 | 9619.16
Mahesh Viswanathan | 6 | 2264206.47
Raimo Bakis | 7 | 153308.32
Wael Hamza | 8 | 19815.84
Michael Picheny | 9 | 1461920.15
P. Gleason | 10 | 111.61
T. Rutherfoord | 11 | 111.61
P. Cox | 12 | 111.61
D. Green | 13 | 112.28
Eric Janke | 14 | 489.98
S. Revelin | 15 | 111.95
Claire Waast-Richard | 16 | 566.58
B. Zeller | 17 | 111.61
C. Guenther | 18 | 111.61
J. Kunzmann | 19 | 111.61