Title
The HW-TSC's Offline Speech Translation System for IWSLT 2022 Evaluation
Abstract
This paper describes the design of HW-TSC’s Offline Speech Translation System submitted to the IWSLT 2022 Evaluation. We explored both cascade and end-to-end systems on three language tracks (en-de, en-zh and en-ja), and chose the cascade system as our primary submission. For the automatic speech recognition (ASR) component of the cascade system, we trained three ASR models (Conformer, S2T-Transformer and U2) on a mixture of five datasets. During inference, transcripts are generated with the help of a domain-controlled generation strategy. Context-aware reranking and an ensemble-based anti-interference strategy are proposed to produce better ASR outputs. For the machine translation component, we pretrained three translation models on the WMT21 dataset and fine-tuned them on in-domain corpora. Our cascade system shows competitive performance compared with known offline systems in industry and academia.
Year
2022
DOI
10.18653/v1/2022.iwslt-1.20
Venue
International Conference on Spoken Language Translation (IWSLT)
DocType
Conference
Volume
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)
Citations
0
PageRank
0.34
References
0
Authors
11
Name            Order  Citations  PageRank
Minghan Wang    1      0          2.03
Jiaxin Guo      2      0          4.73
Xiaosong Qiao   3      0          1.01
Yuxia Wang      4      0          2.70
Daimeng Wei     5      0          5.07
Chang Su        6      0          3.38
Yimeng Chen     7      0          4.06
Min Zhang       8      1849       157.00
Shimin Tao      9      0          4.73
Hao Yang        10     0          7.44
Ying Qin        11     0          5.75