Title: Word-level Speech Recognition with a Dynamic Lexicon
Abstract: We propose a direct-to-word sequence model with a dynamic lexicon. Our word network constructs word embeddings dynamically from character-level tokens. The word network can be integrated seamlessly with arbitrary sequence models, including Connectionist Temporal Classification (CTC) and encoder-decoder models with attention. Sub-word units are commonly used in speech recognition, yet they are generated without the use of acoustic context. We show that our direct-to-word model can achieve word error rate gains over sub-word-level models for speech recognition. Furthermore, we empirically validate that the word-level embeddings we learn contain significant acoustic information, making them more suitable for use in speech recognition. We also show that our direct-to-word approach retains the ability to predict words not seen at training time without any retraining.
Year: 2019
Venue: CoRR
DocType: Journal
Volume: abs/1906.04323
Citations: 0
PageRank: 0.34
References: 0
Authors: 3
Name              Order  Citations  PageRank
Ronan Collobert   1      4002       308.61
Awni Y. Hannun    2      517        27.54
Gabriel Synnaeve  3      240        16.91