Abstract | ||
---|---|---|
End-to-end speech recognition systems based on deep neural networks (DNNs) have achieved good results. We propose applying Connectionist Temporal Classification (CTC) models and attention-based encoder-decoder models to automatic recognition of continuous Russian speech. We conducted experiments with several neural network models, such as long short-term memory (LSTM), bidirectional LSTM, and residual networks. Recognition accuracy was slightly worse than that of hybrid models, but our models can work without a large language model and showed better average decoding speed, which can be helpful in real systems. Experiments were performed on Russian speech with an extra-large vocabulary (more than 150K words). |
Year | DOI | Venue |
---|---|---|
2018 | 10.1007/978-3-319-99579-3_40 | Lecture Notes in Artificial Intelligence |
Keywords | Field | DocType |
End-to-end models,Deep learning,Russian speech,Speech recognition | Residual,End-to-end principle,Computer science,Speech recognition,Artificial intelligence,Decoding methods,Deep learning,Artificial neural network,Vocabulary,Connectionism,Language model | Conference |
Volume | ISSN | Citations |
11096 | 0302-9743 | 0 |
PageRank | References | Authors |
0.34 | 16 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Nikita Markovnikov | 1 | 0 | 0.68 |
Irina S. Kipyatkova | 2 | 72 | 14.65 |
Elena E. Lyakso | 3 | 25 | 8.99 |