Abstract | ||
---|---|---|
An increasingly common scenario in building speech synthesis and recognition systems is training on inhomogeneous data. This paper proposes a new framework for estimating hidden Markov models on data containing both multiple speakers and multiple languages. The proposed framework, speaker and language factorization, attempts to factorize speaker-/language-specific characteristics in the data and then model them using separate transforms. Language-specific factors in the data are represented by transforms based on cluster mean interpolation with cluster-dependent decision trees. Acoustic variations caused by speaker characteristics are handled by transforms based on constrained maximum-likelihood linear regression. Experimental results on statistical parametric speech synthesis show that the proposed framework enables data from multiple speakers in different languages to be used to: train a synthesis system; synthesize speech in a language using speaker characteristics estimated in a different language; and adapt to a new language. |
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/TASL.2012.2187195 | IEEE Transactions on Audio, Speech, and Language Processing |
Keywords | Field | DocType |
decision trees, matrix decomposition, hidden markov model, speech synthesis, interpolation, regression analysis, hidden markov models, decision tree, speech recognition | Speech synthesis, Pattern recognition, Computer science, Markov model, Matrix decomposition, Interpolation, Speech recognition, Speaker recognition, Parametric statistics, Artificial intelligence, Constructed language, Hidden Markov model | Journal |
Volume | Issue | ISSN |
20 | 6 | 1558-7916 |
Citations | PageRank | References |
12 | 0.58 | 23 |
Authors | ||
7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Heiga Zen | 1 | 1922 | 103.73 |
Norbert Braunschweiler | 2 | 59 | 8.47 |
Sabine Buchholz | 3 | 56 | 3.96 |
Mark J. F. Gales | 4 | 3905 | 367.45 |
Kate Knill | 5 | 249 | 28.02 |
Sacha Krstulovic | 6 | 106 | 11.97 |
Javier Latorre | 7 | 61 | 5.09 |