Abstract | ||
---|---|---|
This paper presents an emotional voice conversion (VC) technique based on non-negative matrix factorization, in which parallel exemplars are used to encode the source speech signal and synthesize the target speech signal. The input source spectrum is decomposed into source spectrum exemplars and their weights. By replacing the source exemplars with target exemplars, the converted spectrum and F0 are constructed from the target exemplars and the target F0 that is paired with those exemplars. To reduce computation time, we applied non-negative matrix factorization using active Newton set algorithms to our VC method. We carried out emotional voice conversion tasks that convert an emotional voice into a neutral voice. The effectiveness of the method was confirmed by objective and subjective evaluations. |
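
The exemplar-replacement step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a fixed pair of parallel dictionaries (one column per exemplar), estimates the non-negative weights with standard multiplicative updates for the KL divergence rather than the Newton-type solver the abstract mentions, and converts only the spectrum, not F0. The function names (`estimate_weights`, `convert`) and the toy values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def estimate_weights(source_dict, spectrum, n_iter=200, eps=1e-12):
    """Estimate non-negative weights H such that source_dict @ H ~ spectrum.

    source_dict: bins x exemplars, spectrum: bins x frames.
    The dictionary stays fixed; only H is updated (KL multiplicative updates).
    """
    n_exemplars = len(source_dict[0])
    n_frames = len(spectrum[0])
    weights = [[1.0] * n_frames for _ in range(n_exemplars)]
    dict_t = transpose(source_dict)
    # Sum of each exemplar over frequency bins (the W^T 1 term of the update).
    col_sums = [sum(row) for row in dict_t]
    for _ in range(n_iter):
        approx = matmul(source_dict, weights)
        ratio = [[x / (a + eps) for x, a in zip(x_row, a_row)]
                 for x_row, a_row in zip(spectrum, approx)]
        numer = matmul(dict_t, ratio)
        weights = [[h * n / (s + eps) for h, n in zip(h_row, n_row)]
                   for h_row, n_row, s in zip(weights, numer, col_sums)]
    return weights


def convert(source_dict, target_dict, spectrum, n_iter=200):
    """Reuse the source weights with the paired target exemplars."""
    weights = estimate_weights(source_dict, spectrum, n_iter)
    return matmul(target_dict, weights)


if __name__ == "__main__":
    # Toy data: 3 frequency bins, 2 parallel exemplars, 1 input frame.
    source = [[1.0, 0.2], [0.5, 1.0], [0.1, 0.6]]
    target = [[0.8, 0.1], [0.6, 0.9], [0.3, 0.7]]
    frame = [[1.2], [1.5], [0.7]]
    print(convert(source, target, frame))
```

Because the two dictionaries are parallel, column j of the target dictionary corresponds to column j of the source dictionary, so the weights estimated on the source side can be applied to the target side unchanged.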
Year | DOI | Venue |
---|---|---|
2014 | 10.1109/APSIPA.2014.7041640 | Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference |
Keywords | Field | DocType |
Newton method,matrix decomposition,speech processing,speech recognition,Newton set algorithms,emotional voice conversion technology,nonnegative matrix factorization,source speech signal,target speech signal | ENCODE,Speech processing,Pattern recognition,Computer science,Cepstrum,Matrix decomposition,Feature extraction,Speech recognition,Artificial intelligence,Non-negative matrix factorization,Hidden Markov model,Sparse matrix | Conference |
ISSN | Citations | PageRank |
2309-9402 | 4 | 0.39 |
References | Authors | |
21 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Aihara, R. | 1 | 4 | 0.73 |
Reina Ueda | 2 | 4 | 0.39 |
Tetsuya Takiguchi | 3 | 85 | 8.77 |
Yasuo Ariki | 4 | 519 | 88.94 |