Abstract | ||
---|---|---|
In this paper we describe the current status of the trainable text-to-speech system at IBM. Recent algorithmic and database changes to the system have led to significant gains in output quality. On the algorithmic side, we have introduced statistical models for predicting pitch and duration targets, which replace the rule-based target generation previously employed. Additionally, we have changed the cost function and the search strategy, introduced a post-search pitch smoothing algorithm, and improved our method of preselection. Through the combined data and algorithmic contributions, we have significantly improved (p < 0.0001) the mean opinion score (MOS) of our female voice on a seven-point scale, from 3.68 to 4.85 when heard over loudspeakers and to 5.42 when heard over the telephone. |
Year | DOI | Venue |
---|---|---|
2003 | 10.1109/ICASSP.2003.1198879 | ICASSP (1) |
Keywords | Field | DocType |
duration,search strategy,statistical models,mean opinion score,frequency estimation,smoothing methods,statistical analysis,post-search pitch smoothing algorithm,speech synthesis,search problems,IBM trainable speech synthesis system,database changes,text-to-speech system,prediction theory,pitch prediction,output quality,preselection,cost function,algorithmic changes,stress,databases,decision trees,text to speech,rule based,statistical model,signal generators,knowledge based systems,speech processing | Decision tree,Speech processing,IBM,Computer science,Artificial intelligence,Loudspeaker,Speech synthesis,Pattern recognition,Mean opinion score,Speech recognition,Smoothing,Statistical model,Machine learning | Conference |
Volume | ISSN | ISBN |
1 | 1520-6149 | 0-7803-7663-3 |
Citations | PageRank | References |
22 | 2.28 | 4 |
Authors | ||
11 |
Name | Order | Citations | PageRank |
---|---|---|---|
Eric N. Eide | 1 | 22 | 2.28 |
Adam M. Aaron | 2 | 23 | 2.65 |
Raimo Bakis | 3 | 153 | 308.32 |
Rami Cohen | 4 | 22 | 2.28 |
Robert E. Donovan | 5 | 79 | 17.28 |
Wael Hamza | 6 | 198 | 15.84 |
Tom Mathes | 7 | 22 | 2.28 |
Michael Picheny | 8 | 1461 | 920.15 |
Melanie Diane Polkosky | 9 | 22 | 2.28 |
M. B. Smith | 10 | 22 | 2.28 |
Mahesh Viswanathan | 11 | 2264 | 206.47 |