Title
Generalizing to generalize: Humans flexibly switch between compositional and conjunctive structures during reinforcement learning.
Abstract
Humans routinely face novel environments in which they have to generalize in order to act adaptively. However, doing so involves the non-trivial challenge of deciding which aspects of a task domain to generalize. While it is sometimes appropriate to simply re-use a learned behavior, often adaptive generalization entails recombining distinct components of knowledge acquired across multiple contexts. Theoretical work has suggested a computational trade-off in which it can be more or less useful to learn and generalize aspects of task structure jointly or compositionally, depending on previous task statistics, but it is unknown whether humans modulate their generalization strategy accordingly. Here we develop a series of navigation tasks that separately manipulate the statistics of goal values ("what to do") and state transitions ("how to do it") across contexts and assess whether human subjects generalize these task components separately or conjunctively. We find that human generalization is sensitive to the statistics of the previously experienced task domain, favoring compositional or conjunctive generalization when the task statistics are indicative of such structures, and a mixture of the two when they are more ambiguous. These results support a normative "meta-generalization" account and suggest that people not only generalize previous task components but also generalize the statistical structure most likely to support generalization.
Author summary
To act in new situations, people not only have to generalize from previous experiences, but they also have to decide how to do so. One strategy is to re-use behaviors they've already learned, but this will only be helpful if all aspects of the new situation are similar enough. Alternatively, people can combine knowledge from multiple sources and devise a new plan. For example, a skilled musician may re-use the hand motions learned playing the guitar to play a different style of music on a banjo.
Previous theoretical work has suggested that the best strategy is to learn from the statistics of the environment to decide how to best generalize, whereby some environments imply that all parts of a task should be re-used as a whole, whereas others suggest that different components can be generalized separately. Here, we test whether people's generalization strategy changes with their environment using three navigation tasks, in which people have to decide both where they want to go and how to get there. We varied whether it was advantageous to generalize these two pieces of information separately or together and found that people adapted their generalization in line with an optimal computational model of meta-generalization. These results suggest that people not only generalize what they learn within a single task, but also generalize their generalization strategy.
Year: 2020
DOI: 10.1371/journal.pcbi.1007720; 10.1371/journal.pcbi.1007720.r001; 10.1371/journal.pcbi.1007720.r002; 10.1371/journal.pcbi.1007720.r003; 10.1371/journal.pcbi.1007720.r004
Venue: PLOS COMPUTATIONAL BIOLOGY
DocType: Journal
Volume: 16
Issue: 4
ISSN: 1553-734X
Citations: 0
PageRank: 0.34
References: 0
Authors: 2
Name, Order, Citations, PageRank
Nicholas T. Franklin, 1, 1, 0.69
Michael J. Frank, 2, 1, 1.37