A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation - Citegraph

Paper Info

Title
A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation

Abstract
Non-autoregressive translation (NAT) models struggle to capture the multi-modal distribution of target translations because of their conditional independence assumption. This is known as the "multi-modality problem" and comprises lexical multi-modality and syntactic multi-modality. While the former has been well studied, syntactic multi-modality poses a severe challenge to the standard cross entropy (XE) loss in NAT and is understudied. In this paper, we conduct a systematic study of the syntactic multi-modality problem. Specifically, we decompose it into short- and long-range syntactic multi-modalities and evaluate several recent NAT algorithms with advanced loss functions on both carefully designed synthesized datasets and real datasets. We find that the Connectionist Temporal Classification (CTC) loss and the Order-Agnostic Cross Entropy (OAXE) loss better handle short- and long-range syntactic multi-modalities, respectively. Furthermore, we take the best of both and design a new loss function to better handle the complicated syntactic multi-modality in real-world datasets. To facilitate practical usage, we provide a guide to using different loss functions for different kinds of syntactic multi-modality.

Year	2022
DOI	10.18653/V1/2022.NAACL-MAIN.126
Venue	North American Chapter of the Association for Computational Linguistics (NAACL)
DocType	Conference
Citations	0
PageRank	0.34
References	0
Authors	7

Authors

Name	Order	Citations	PageRank
Kexun Zhang	1	0	0.34
Rui Wang	2	0	1.69
Xu Tan	3	88	23.94
Junliang Guo	4	7	2.44
Yi Ren	5	10	4.35
Tao Qin	6	2384	147.25
Tie-Yan Liu	7	4662	256.32
