Title | ||
---|---|---|
Burmese Speech Corpus, Finite-State Text Normalization and Pronunciation Grammars with an Application to Text-to-Speech. |
Abstract | ||
---|---|---|
This paper introduces an open-source crowd-sourced multi-speaker speech corpus along with a comprehensive set of finite-state transducer (FST) grammars for performing text normalization for the Burmese (Myanmar) language. We also introduce open-source finite-state grammars for performing grapheme-to-phoneme (G2P) conversion for Burmese. These three components are necessary (but not sufficient) for building a high-quality text-to-speech (TTS) system for Burmese, a tonal Southeast Asian language from the Sino-Tibetan family which presents several linguistic challenges. We describe the corpus acquisition process and provide the details of our finite-state-based approach to Burmese text normalization and G2P. Our experiments involve building a multi-speaker TTS system based on long short-term memory (LSTM) recurrent neural network (RNN) models, which were previously shown to perform well for other languages in a low-resource setting. Our results indicate that the data and grammars that we are announcing are sufficient to build reasonably high-quality models comparable to other systems. We hope these resources will facilitate speech and language research on the Burmese language, which is considered by many to be low-resource due to the limited availability of free linguistic data. |
Year | Venue | DocType |
---|---|---|
2020 | LREC | Conference |
Citations | PageRank | References |
---|---|---|
0 | 0.34 | 0 |
Authors | ||
---|---|---|
9 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yin May Oo | 1 | 0 | 0.34 |
Theeraphol Wattanavekin | 2 | 0 | 1.35 |
Chenfang Li | 3 | 0 | 0.68 |
Pasindu De Silva | 4 | 0 | 1.35 |
Supheakmungkol SARIN | 5 | 10 | 4.45 |
Knot Pipatsrisawat | 6 | 358 | 20.44 |
Martin Jansche | 7 | 257 | 23.92 |
Oddur Kjartansson | 8 | 6 | 4.89 |
Alexander Gutkin | 9 | 1 | 6.45 |