Paper Info

Title
Building Statistical Parametric Multi-speaker Synthesis for Bangladeshi Bangla.

Abstract
We present a text-to-speech (TTS) system designed for the dialect of Bengali spoken in Bangladesh. This work is part of an ongoing effort to address the needs of new under-resourced languages. We propose a process for streamlining the bootstrapping of TTS systems for under-resourced languages. First, we use crowdsourcing to collect data from multiple ordinary speakers, each recording a small number of sentences. Second, we leverage an existing text normalization system for a related language (Hindi) to bootstrap a linguistic front-end for Bangla. Third, we employ statistical techniques to construct multi-speaker acoustic models using Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) and Hidden Markov Model (HMM) approaches. We then describe experiments showing that the resulting TTS voices score well in terms of their perceived quality, as measured by Mean Opinion Score (MOS) evaluations.

Year	2016
DOI	10.1016/j.procs.2016.04.049
Venue	Procedia Computer Science
Keywords	TTS, Bangladesh, HMM, LSTM-RNN, acoustic modeling
Field	Crowdsourcing, Hindi, Computer science, Bootstrapping, Recurrent neural network, Speech recognition, Mean opinion score, Bengali, Artificial intelligence, Natural language processing, Hidden Markov model, Text normalization
DocType	Conference
Volume	81
ISSN	1877-0509
Citations	2
PageRank	0.40
References	15
Authors	6

Authors (6 rows)

Name	Order	Citations	PageRank
Alexander Gutkin	1	2	1.08
Linne Ha	2	5	3.19
Martin Jansche	3	257	23.92
Oddur Kjartansson	4	6	4.89
Knot Pipatsrisawat	5	358	20.44
Richard Sproat	6	31	7.34
