Title
Improved Multi-Stage Training of Online Attention-Based Encoder-Decoder Models
Abstract
In this paper, we propose a refined multi-stage multi-task training strategy to improve the performance of online attention-based encoder-decoder (AED) models. A three-stage training scheme based on three levels of architectural granularity, namely a character-level encoder, a byte pair encoding (BPE) based encoder, and an attention decoder, is proposed. In addition, multi-task learning based on two levels of linguistic granularity, character and BPE, is used. We explore different pre-training strategies for the encoders, including transfer learning from a bidirectional encoder. Our encoder-decoder models with online attention show ~35% and ~10% relative improvement over their baselines for smaller and bigger models, respectively. After fusion with a long short-term memory (LSTM) based external language model (LM), our models achieve word error rates (WERs) of 5.04% and 4.48% on the LibriSpeech test-clean data for the smaller and bigger models, respectively.
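For illustration, below is a minimal PyTorch sketch of the two-level multi-task objective described in the abstract: a character-level CTC loss computed on the (lower, character-level) encoder output is interpolated with a BPE-level cross-entropy loss on the attention decoder output. The vocabulary sizes, tensor shapes, and the interpolation weight lambda_char are illustrative assumptions, not the paper's exact configuration.

    import torch
    import torch.nn.functional as F

    def multitask_loss(char_logits, char_targets, feat_lens, char_lens,
                       dec_logits, bpe_targets, lambda_char=0.3, pad_id=-100):
        # Character-level CTC on frame-level encoder outputs.
        # F.ctc_loss expects (T, B, V_char) log-probabilities.
        log_probs = char_logits.log_softmax(-1).transpose(0, 1)
        char_loss = F.ctc_loss(log_probs, char_targets, feat_lens, char_lens,
                               blank=0, zero_infinity=True)
        # BPE-level cross entropy on the decoder's per-token predictions.
        bpe_loss = F.cross_entropy(dec_logits.reshape(-1, dec_logits.size(-1)),
                                   bpe_targets.reshape(-1), ignore_index=pad_id)
        # Interpolate the two tasks (lambda_char is an assumed hyperparameter).
        return lambda_char * char_loss + (1.0 - lambda_char) * bpe_loss

    # Toy usage with random tensors, only to show the expected shapes.
    B, T, U = 2, 50, 12
    char_logits = torch.randn(B, T, 30)            # 30 character classes incl. CTC blank
    dec_logits = torch.randn(B, U, 1000)           # 1000 BPE units
    char_targets = torch.randint(1, 30, (B, 20))   # character label sequences
    bpe_targets = torch.randint(0, 1000, (B, U))   # BPE label sequences
    feat_lens = torch.full((B,), T, dtype=torch.long)
    char_lens = torch.full((B,), 20, dtype=torch.long)
    loss = multitask_loss(char_logits, char_targets, feat_lens, char_lens,
                          dec_logits, bpe_targets)

In the actual multi-stage setup, the character-level loss would typically be attached after the pre-trained character encoder and the BPE-level loss after the full encoder-decoder stack; the weighted sum above only illustrates how the two granularities can be trained jointly.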
Year: 2019
DOI: 10.1109/ASRU46091.2019.9003936
Venue: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
Keywords: Attention based encoder-decoder models, online attention, multi-stage training, multi-task learning
DocType: Conference
ISBN: 978-1-7281-0307-5
Citations: 0
PageRank: 0.34
References: 0
Authors: 6

Name               Order  Citations  PageRank
Abhinav Garg       1      6          6.61
Dhananjaya Gowda   2      3          5.47
Ankur N Kumar      3      8          3.39
Kwangyoun Kim      4      2          4.11
Mehul Kumar        5      1          2.73
Chanwoo Kim        6      253        28.44