Title
Exploiting Inactive Examples for Natural Language Generation With Data Rejuvenation
Abstract
Recent years have witnessed the success of natural language generation (NLG) accomplished by deep neural networks, which require a large amount of training data for optimization. With the constant increase of data scale, the complex patterns and potential noise in the data make training NLG models difficult. To fully utilize large-scale training data, we explore inactive examples in the training data and propose to rejuvenate them to improve the performance of NLG models. Specifically, we define inactive examples as the sentence pairs that contribute little to the performance of NLG models, and show that their existence is independent of model variants and is mainly determined by the data distribution. We further introduce data rejuvenation, which improves the training of NLG models by re-labeling the inactive examples. The rejuvenated examples and the active examples are then combined to train the final NLG model. We evaluate our approach with experiments on machine translation (MT) and text summarization (TS) tasks and achieve significant performance improvements. Extensive analyses reveal that inactive examples are more difficult to learn than active ones and that rejuvenation reduces this learning difficulty, which stabilizes and accelerates the training of NLG models and yields models with better generalization capability.
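The abstract outlines a three-step pipeline: identify inactive examples with a model trained on the full data, re-label their source sentences with a model trained on the active examples only, and train the final model on the combined data. The Python sketch below illustrates that flow under stated assumptions; the helper callables (train_fn, score_fn, generate_fn) and the inactive_ratio threshold are hypothetical placeholders for illustration, not the authors' implementation.

```python
# A minimal sketch of the data rejuvenation pipeline described in the abstract.
# The callables and the thresholding strategy are illustrative assumptions.
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (source sentence, target sentence)


def rejuvenate(
    data: List[Example],
    train_fn: Callable[[List[Example]], object],    # trains an NLG model on a set of examples
    score_fn: Callable[[object, Example], float],   # per-example contribution score (e.g., sentence-level probability)
    generate_fn: Callable[[object, str], str],      # generates a target sentence for a source sentence
    inactive_ratio: float = 0.1,                    # assumed fraction of examples treated as inactive
) -> object:
    """Identify inactive examples, re-label them, and train the final model."""
    # 1. Train an identification model on the full data and score every example.
    ident_model = train_fn(data)
    scored = sorted(data, key=lambda ex: score_fn(ident_model, ex))

    # 2. The lowest-scoring examples contribute least and are treated as inactive.
    n_inactive = int(len(scored) * inactive_ratio)
    inactive, active = scored[:n_inactive], scored[n_inactive:]

    # 3. Train a rejuvenation model on the active examples only, then
    #    re-label the inactive sources with its outputs.
    rejuv_model = train_fn(active)
    rejuvenated = [(src, generate_fn(rejuv_model, src)) for src, _ in inactive]

    # 4. Train the final model on the active plus rejuvenated examples.
    return train_fn(active + rejuvenated)
```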
Year
2022
DOI
10.1109/TASLP.2022.3153269
Venue
IEEE/ACM Transactions on Audio, Speech, and Language Processing
Keywords
Natural language generation, inactive example, data rejuvenation, machine translation, text summarization
DocType
Journal
Volume
30
Issue
1
ISSN
2329-9290
Citations
0
PageRank
0.34
References
16
Authors
6
Name            Order  Citations  PageRank
Wenxiang Jiao   1      2          2.39
Xing Wang       2      0          0.34
Shilin He       3      101        6.89
Zhaopeng Tu     4      518        39.95
Irwin King      5      6751       325.94
Michael R. Lyu  6      10985      529.03