Title | ||
---|---|---|
Mapping Language to Programs using Multiple Reward Components with Inverse Reinforcement Learning |
Abstract | ||
---|---|---|
Mapping natural language instructions to programs that computers can process is a fundamental challenge. Existing approaches focus on likelihood-based training or using reinforcement learning to fine-tune models based on a single reward. In this paper, we pose program generation from language as Inverse Reinforcement Learning. We introduce several interpretable reward components and jointly learn (1) a reward function that linearly combines them, and (2) a policy for program generation. Fine-tuning with our approach achieves significantly better performance than competitive methods using Reinforcement Learning (RL). On the VirtualHome framework, we get improvements of up to 9.0% on the Longest Common Subsequence metric and 14.7% on recall-based metrics over previous work on this framework (Puig et al., 2018). The approach is data-efficient, showing larger gains in performance in the low-data regime. Generated programs are also preferred by human evaluators over an RL-based approach, and rated higher on relevance, completeness, and human-likeness. |
Year | Venue | DocType |
---|---|---|
2021 | EMNLP | Conference |
Volume | Citations | PageRank |
---|---|---|
2021.findings-emnlp | 0 | 0.34 |
References | Authors | |
---|---|---|
0 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sayan Ghosh | 1 | 17 | 8.98 |
Shashank Srivastava | 2 | 0 | 3.04 |