Title
Prosody-aware subword embedding considering Japanese intonation systems and its application to DNN-based multi-dialect speech synthesis
Abstract
This paper presents prosody-aware subword embedding considering Japanese intonation systems and its application to DNN (deep neural network)-based multi-dialect speech synthesis. As speech synthesis for resource-rich languages has improved, research is shifting to more challenging targets such as Japanese dialects, whose prosodic contexts are still undefined. Conventional prosody-aware word embedding can extract these contexts in an unsupervised, data-driven manner from words and $F_0$ sequences. However, it is difficult to generate accurate contexts for unknown words. To solve this problem, we propose prosody-aware subword embedding considering Japanese intonation systems. The unsupervised subword model, which is trained considering linguistic and acoustic characteristics, can tokenize an unknown word into known subwords suitable for prosody-aware embedding. We also propose a modulation filtering method that considers intra-subword moras to improve embedding accuracy. We apply the methods not only to Japanese speech synthesis but also to Japanese multi-dialect speech synthesis. In the multi-dialect case, we propose subword models shared among dialects and embedding models conditioned on dialect information. The experimental evaluation demonstrates that the proposed multi-dialect methods can improve speech quality in some Japanese dialects.
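The idea of tokenizing an unknown word into known subwords can be illustrated with a minimal sketch. It is not the paper's subword model, which is trained on both linguistic and acoustic characteristics; it assumes a plain unigram model with a dynamic-programming search, and the function name `segment`, the romanized vocabulary, and the log-probabilities are all hypothetical.

```python
import math

def segment(word, subword_logprob):
    """Split `word` into known subwords with the highest total log-probability.

    Assumption: a plain unigram model; the paper's model also uses acoustic cues.
    """
    n = len(word)
    best = [-math.inf] * (n + 1)  # best[i]: best score for word[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last subword
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(end):
            piece = word[start:end]
            if piece in subword_logprob and best[start] > -math.inf:
                score = best[start] + subword_logprob[piece]
                if score > best[end]:
                    best[end] = score
                    back[end] = start
    if best[n] == -math.inf:
        return None  # no segmentation into known subwords exists
    pieces = []
    i = n
    while i > 0:
        pieces.append(word[back[i]:i])
        i = back[i]
    return pieces[::-1]

# Hypothetical vocabulary (romanized) with made-up log-probabilities.
vocab = {"ha": -2.0, "shi": -2.0, "hashi": -2.5, "ga": -1.5,
         "ka": -2.0, "ki": -2.2, "kaki": -3.0}

print(segment("hashiga", vocab))
```

With this toy vocabulary, the unseen string "hashiga" is covered by the known subwords "hashi" and "ga", each of which could then be looked up in an embedding table.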
Year
2018
DOI
10.23919/APSIPA.2018.8659465
Venue
2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
Keywords
Speech synthesis, Modulation, Training, Context modeling, Training data, Feature extraction, Data models
Field
Prosody, Data modeling, Speech synthesis, Embedding, Computer science, Feature extraction, Speech recognition, Context model, Word embedding, Artificial neural network
DocType
Conference
ISSN
2309-9402
ISBN
978-9-8814-7685-2
Citations
0
PageRank
0.34
References
0
Authors
3
Name | Order | Citations | PageRank
Takanori Akiyama | 1 | 0 | 0.68
Shinnosuke Takamichi | 2 | 75 | 22.08
Saruwatari, H. | 3 | 652 | 90.81