Title: How Does Distilled Data Complexity Impact the Quality and Confidence of Non-Autoregressive Machine Translation?
Abstract: While non-autoregressive (NAR) models show great promise for machine translation, their use is limited by their dependence on knowledge distillation from autoregressive models. To address this issue, we seek to understand why distillation is so effective. Prior work suggests that distilled training data is less complex than manual translations. Based on experiments with the Levenshtein Transformer and the Mask-Predict NAR models on the WMT14 German-English task, this paper shows that different types of complexity have different impacts. Reducing lexical diversity and decreasing reordering complexity both help NAR models learn better alignment between source and target, and thus improve translation quality. However, lexical diversity is the main reason why distillation increases model confidence, which affects the calibration of different NAR models differently.
Year: 2021
Venue: ACL/IJCNLP
DocType: Conference
Volume: 2021.findings-acl
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name            Order  Citations  PageRank
Weijia Xu       1      0          5.75
Shuming Ma      2      83         15.92
Dongdong Zhang  3      241        28.73
Marine Carpuat  4      587        51.99