Title: Sequence Generation with Guider Network
Abstract:
Sequence generation with reinforcement learning (RL) has received significant attention recently. However, a challenge with such methods is the sparse-reward problem in the RL training process, in which a scalar guiding signal is often only available after an entire sequence has been generated. This type of sparse reward tends to ignore the global structural information of a sequence, causing generation of sequences that are semantically inconsistent. In this paper, we present a model-based RL approach to overcome this issue. Specifically, we propose a novel guider network to model the sequence-generation environment, which can assist next-word prediction and provide intermediate rewards for generator optimization. Extensive experiments show that the proposed method leads to improved performance for both unconditional and conditional sequence-generation tasks.
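The abstract's key idea is replacing a single end-of-sequence reward with denser per-step rewards supplied by an auxiliary model of the environment. As a minimal illustrative sketch (not the paper's actual guider network), the toy below stands in a smoothed bigram model for the guider and scores each candidate next word with an intermediate reward; all names (`train_guider`, `guider_reward`) are hypothetical.

```python
# Hypothetical sketch: per-step "intermediate rewards" from a simple stand-in
# guider model (a smoothed bigram model), instead of one sparse final reward.
from collections import defaultdict
import math

def train_guider(corpus):
    """Fit bigram counts as a toy stand-in for the guider network."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        for prev, nxt in zip(sentence, sentence[1:]):
            counts[prev][nxt] += 1
    return counts

def guider_reward(counts, prefix, word, alpha=1.0):
    """Intermediate reward for appending `word` to `prefix`: the smoothed
    log-probability of `word` under the guider, available at every step."""
    prev = prefix[-1]
    total = sum(counts[prev].values())
    vocab = {w for d in counts.values() for w in d} | set(counts)
    p = (counts[prev][word] + alpha) / (total + alpha * len(vocab))
    return math.log(p)

corpus = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "cat", "ran"]]
g = train_guider(corpus)
# A plausible continuation receives a higher (less negative) reward
# than an implausible one, giving the generator a dense training signal.
r_good = guider_reward(g, ["the"], "cat")
r_bad = guider_reward(g, ["the"], "sat")
```

In an RL training loop, such per-step scores would be added to (or shaped against) the terminal reward, so the policy gradient receives feedback before the sequence is complete.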
Year: 2018
Venue: arXiv: Computation and Language
DocType: Journal
Volume: abs/1811.00696
Citations: 2
PageRank: 0.35
References: 21
Authors: 8
Name            Order  Citations  PageRank
Ruiyi Zhang     1      3          2.41
Changyou Chen   2      365        36.95
Zhe Gan         3      319        32.58
Wenlin Wang     4      51         7.06
Liqun Chen      5      2082       139.89
Dinghan Shen    6      108        10.37
Guoyin Wang     7      24         7.38
L. Carin        8      4603       339.36