Title
Random Forests For Statistical Speech Synthesis
Abstract
The world of statistical parametric speech synthesis continues to improve with recent investigations of different machine learning techniques to better model spectrum, F0 and duration from corpora of natural speech. Traditional techniques rely on decision trees alone. This paper shows the advantages of modeling with random forests of decision trees over single trees. Improvements equivalent to more than doubling the data can be achieved, offering end users significantly better synthesis from the same data size. These techniques give proportionally more improvements on smaller datasets, particularly with voices with only 30 minutes of speech. These techniques have been tested over a wide range of voices and languages of various sizes and quality, producing significant improvements in all cases. These techniques are documented, and robustly implemented for others to use through the Dec 2014 release of the Festvox voice building toolkit, thereby directly allowing these benefits to be used in standard voices built for the Festival Speech Synthesis System and CMU Flite.
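The abstract contrasts a single decision tree with a random forest of trees. The general idea can be illustrated with a minimal sketch, which is not the Festvox implementation and makes no claim about the paper's exact procedure. It assumes a generic recipe: each regression tree is grown on a bootstrap sample using a random subset of features, and the forest predicts the average of its trees. All function names, parameters, and the synthetic data below are hypothetical and chosen only for illustration.

```python
import random
from statistics import fmean


def sum_sq(values):
    """Sum of squared deviations from the mean (impurity of a node)."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, targets, features, depth=0, max_depth=3, min_leaf=2):
    """Grow a small regression tree; a leaf is the mean target (a float)."""
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return fmean(targets)
    best = None  # (impurity, feature, threshold, left indices, right indices)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            cost = (sum_sq([targets[i] for i in left])
                    + sum_sq([targets[i] for i in right]))
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    if best is None:  # no admissible split: stop and make a leaf
        return fmean(targets)
    _, f, t, left, right = best
    return (
        f,
        t,
        build_tree([rows[i] for i in left], [targets[i] for i in left],
                   features, depth + 1, max_depth, min_leaf),
        build_tree([rows[i] for i in right], [targets[i] for i in right],
                   features, depth + 1, max_depth, min_leaf),
    )


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def build_forest(rows, targets, n_trees=10, n_features=2, seed=0):
    """Assumed recipe: bootstrap rows and a random feature subset per tree."""
    rng = random.Random(seed)
    all_features = list(range(len(rows[0])))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feats = rng.sample(all_features, n_features)
        forest.append(build_tree([rows[i] for i in idx],
                                 [targets[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    """The forest prediction is the average of the tree predictions."""
    return fmean([predict_tree(t, row) for t in forest])


# Synthetic stand-in data (hypothetical; real voices would use linguistic
# context features and acoustic targets such as spectrum, F0, or duration).
data_rng = random.Random(1)
rows = [[data_rng.random() for _ in range(3)] for _ in range(60)]
targets = [2.0 * r[0] + r[1] for r in rows]
forest = build_forest(rows, targets)
print(predict_forest(forest, rows[0]))
```

Averaging many trees that each see a different resampling of the data and a different feature subset tends to reduce the variance of a single tree, which is the usual motivation for preferring a forest when training data is limited.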
Year: 2015
Venue: 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5
Keywords: random forests, acoustic modeling, statistical parametric speech synthesis
Field: Speech synthesis, Computer science, Speech recognition, Natural language processing, Artificial intelligence, Random forest
DocType: Conference
Citations: 4
PageRank: 0.41
References: 8
Authors: 2
Name
Order
Citations
PageRank
Alan W. Black14391742.28
Prasanna Kumar Muthukumar2232.71