Title
On the End-to-End Solution to Mandarin-English Code-switching Speech Recognition
Abstract
Code-switching (CS) refers to a linguistic phenomenon where a speaker uses different languages in an utterance or between alternating utterances. In this work, we study end-to-end (E2E) approaches to the Mandarin-English code-switching speech recognition (CSSR) task. We first examine the effectiveness of using data augmentation and byte-pair encoding (BPE) subword units. More importantly, we propose a multitask learning recipe, where a language identification task is explicitly learned in addition to the E2E speech recognition task. Furthermore, we introduce an efficient word vocabulary expansion method for language modeling to alleviate data sparsity issues under the code-switching scenario. Experimental results on the SEAME data, a Mandarin-English CS corpus, demonstrate the effectiveness of the proposed methods.
Year: 2019
DOI: 10.21437/Interspeech.2019-1429
Venue: arXiv: Computation and Language
DocType: Conference
Volume: abs/1811.00241
Citations: 1
PageRank: 0.36
References: 0
Authors: 6
Name                Order  Citations  PageRank
Zhiping Zeng        1      1          3.06
Yerbolat Khassanov  2      3          3.79
Van Tung Pham       3      40         8.42
Haihua Xu           4      55         11.41
Eng Siong Chng      5      970        106.33
Haizhou Li          6      3678       334.61