Abstract | ||
---|---|---|
Text-to-phoneme mapping is an essential preliminary step in any text-to-speech synthesis system. In this paper, we study the performance of the multilayer perceptron (MLP) neural network on the problem of text-to-phoneme mapping. Specifically, we study the influence of the input letter encoding on the conversion accuracy of such a system. We show that for large network complexities the orthogonal binary codes (as introduced in NETtalk) give better performance. On the other hand, in applications that require a very small memory load and low computational complexity, other compact codes may be more suitable. This study is a first step toward implementing neural-network-based text-to-phoneme mapping in mobile devices. |
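
The contrast between orthogonal and compact letter codes in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the 27-symbol alphabet (26 letters plus `_` as padding), the 7-letter context window, and the helper names `one_hot`, `compact`, and `encode_window`. The compact code shown is a plain binary index code, which is only one possible choice of compact code.

```python
# Hypothetical alphabet: 26 lowercase letters plus "_" used as padding.
ALPHABET = "abcdefghijklmnopqrstuvwxyz_"

# Number of bits needed to index every symbol in the alphabet (5 for 27 symbols).
BITS = (len(ALPHABET) - 1).bit_length()


def one_hot(letter):
    """Orthogonal binary code: one input per symbol, exactly one bit set."""
    vec = [0] * len(ALPHABET)
    vec[ALPHABET.index(letter)] = 1
    return vec


def compact(letter):
    """Compact binary code: the symbol index written in BITS binary digits."""
    idx = ALPHABET.index(letter)
    return [(idx >> b) & 1 for b in reversed(range(BITS))]


def encode_window(word, center, half_width, encoder):
    """Concatenate the codes of the letters around `center`, padding with "_"."""
    out = []
    for pos in range(center - half_width, center + half_width + 1):
        letter = word[pos] if 0 <= pos < len(word) else "_"
        out.extend(encoder(letter))
    return out


if __name__ == "__main__":
    # Assumed 7-letter window (half_width=3) centered on the "e" in "speech".
    print(len(encode_window("speech", 2, 3, one_hot)))  # MLP input size, orthogonal code
    print(len(encode_window("speech", 2, 3, compact)))  # MLP input size, compact code
```

Under these assumptions the orthogonal code needs 27 inputs per letter and the compact code needs 5, so the input layer of the network, and with it the number of first-layer weights, shrinks by roughly a factor of five. That is the memory and computation trade-off the abstract refers to.
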
Year | Venue | Keywords |
---|---|---|
2005 | Antalya | vectors,accuracy,artificial neural networks |
Field | DocType | ISBN |
NETtalk,Computer science,Binary code,Speech recognition,Mobile device,Multilayer perceptron,Time delay neural network,Artificial neural network,Encoding (memory),Computational complexity theory | Conference | 978-160-4238-21-1 |
Citations | PageRank | References |
1 | 0.40 | 4 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Eniko Beatrice Bilcu | 1 | 2 | 0.80 |
Jaakko Astola | 2 | 1515 | 230.41 |
Jukka Saarinen | 3 | 264 | 46.21 |