Abstract | ||
---|---|---|
Standard architectures used in instruction following often struggle on novel compositions of subgoals (e.g. navigating to landmarks or picking up objects) observed during training. We propose a modular architecture for following natural language instructions that describe sequences of diverse subgoals. In our approach, subgoal modules each carry out natural language instructions for a specific subgoal type. A sequence of modules to execute is chosen by learning to segment the instructions and predicting a subgoal type for each segment. When compared to standard, non-modular sequence-to-sequence approaches on ALFRED, a challenging instruction following benchmark, we find that modularization improves generalization to novel subgoal compositions, as well as to environments unseen in training. |
Year | Venue | DocType |
---|---|---|
2021 | NAACL-HLT | Conference |
Citations | PageRank | References |
0 | 0.34 | 0 |
Authors | ||
5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Rodolfo Corona | 1 | 0 | 0.68 |
Daniel Fried | 2 | 83 | 7.69 |
Coline Devin | 3 | 3 | 1.05 |
Dan Klein | 4 | 8083 | 495.21 |
Trevor Darrell | 5 | 22413 | 1800.67 |