End-to-end Tibetan Speech Synthesis Based on Phones and Semi-syllables - Citegraph

Paper Info

Title
End-to-end Tibetan Speech Synthesis Based on Phones and Semi-syllables

Abstract
Because Tibetan characters have a two-dimensional structure, it is inconvenient to use letter sequences as the input to an end-to-end speech synthesis system. Experiments are therefore conducted on phone sequences and semi-syllable sequences, respectively. In both training and testing, the text is first segmented into a sequence of syllables, and the syllables are then converted into phones or semi-syllables to form the model's input sequence. The results show that the encoder-decoder alignment of Tibetan speech synthesis is better with phones than with semi-syllables. In addition, the Highway network in the architecture plays a key role in the convergence of the model.
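
The two-stage front end described in the abstract (syllable segmentation, then conversion to model input units) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation. It assumes syllables are delimited by the Tibetan tsheg mark (U+0F0B) and that clauses end with a shad (U+0F0D). The function names `split_syllables` and `to_units`, the boundary token, and both lexicons are hypothetical. The unit labels are placeholders, not real phonetic transcriptions, and the assumption that each syllable yields exactly two semi-syllable units is likewise illustrative.

```python
TSHEG = "\u0f0b"  # Tibetan intersyllabic mark
SHAD = "\u0f0d"   # Tibetan clause-ending mark

def split_syllables(text):
    """Split Tibetan text into syllables, treating tsheg, shad, and
    whitespace as syllable boundaries (a simplifying assumption)."""
    for mark in (SHAD, " ", "\n"):
        text = text.replace(mark, TSHEG)
    return [syl for syl in text.split(TSHEG) if syl]

def to_units(syllables, lexicon, boundary="|"):
    """Map each syllable to its unit sequence via a lookup table and
    join the per-syllable sequences with a boundary token."""
    units = []
    for i, syl in enumerate(syllables):
        if syl not in lexicon:
            raise KeyError(f"syllable not in lexicon: {syl!r}")
        if i > 0:
            units.append(boundary)
        units.extend(lexicon[syl])
    return units

# Two example syllables written with escapes (bod, skad).
SYL_A = "\u0f56\u0f7c\u0f51"
SYL_B = "\u0f66\u0f90\u0f51"
TEXT = SYL_A + TSHEG + SYL_B + SHAD

# Hypothetical lexicons with placeholder labels (NOT real transcriptions).
PHONE_LEXICON = {SYL_A: ["p1", "p2", "p3"], SYL_B: ["p4", "p5", "p6"]}
SEMI_LEXICON = {SYL_A: ["init_a", "final_a"], SYL_B: ["init_b", "final_b"]}

syllables = split_syllables(TEXT)
phone_input = to_units(syllables, PHONE_LEXICON)
semi_input = to_units(syllables, SEMI_LEXICON)
```

Keeping segmentation separate from the unit lookup means the same syllable sequence can feed either lexicon, which mirrors the paper's comparison of the two input types on otherwise identical text.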

Year	DOI	Venue
2019	10.1109/APSIPAASC47483.2019.9023093	Asia-Pacific Signal and Information Processing Association Annual Summit and Conference
DocType	ISSN	Citations
Conference	2309-9402	0
PageRank	References	Authors
0.34	0	4

Authors (4 rows)

Name	Order	Citations	PageRank
Guan-Yu Li	1	2	4.42
Lisai Luo	2	0	0.34
Chunwei Gong	3	0	0.34
Shiliang Lv	4	0	0.34
