Abstract | ||
---|---|---|
In our previous study, we proposed the waveform interpolation (WI) approach to model the excitation signals for hidden Markov model (HMM)-based speech synthesis. This letter presents several techniques to improve excitation modeling within the WI framework. We propose both time-domain and frequency-domain zero-padding techniques to reduce the spectral distortion inherent in the synthesized excitation signal. Furthermore, we apply non-negative matrix factorization (NMF) to obtain a low-dimensional representation of the excitation signals. A number of experiments, including a subjective listening test, show that the proposed method enhances the performance of conventional excitation modeling techniques. |

Field | Value |
---|---|
Year | 2013 |
DOI | 10.1587/transinf.E96.D.379 |
Venue | IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS |
Keywords | HMM-based speech synthesis, waveform interpolation, principal component analysis, non-negative matrix factorization |
DocType | Journal |
Volume | E96D |
Issue | 2 |
ISSN | 1745-1361 |
Citations | 3 |
PageRank | 0.40 |
References | 3 |
Authors | 4 |

Name | Order | Citations | PageRank |
---|---|---|---|
June Sig Sung | 1 | 9 | 2.59 |
Doo Hwa Hong | 2 | 16 | 4.55 |
Hyun Woo Koo | 3 | 4 | 0.76 |
Nam Soo Kim | 4 | 275 | 29.16 |