Abstract | ||
---|---|---|
Reinforcement learning (RL) has proven to be a powerful paradigm for deriving complex behaviors from simple reward signals in a wide range of environments. When applying RL to continuous control agents in simulated physics environments, the body is usually considered to be part of the environment. However, during evolution the physical bodies of biological organisms and their controlling brains are co-evolved, thus exploring a much larger space of actuator/controller configurations. Put differently, intelligence resides not only in the agent's mind but also in the design of its body. We propose a method for uncovering strong agents, each consisting of a good combination of a body and a policy, by combining RL with an evolutionary procedure. Given the resulting agent, we also propose an approach for identifying the body changes that contributed the most to the agent's performance. We use the Shapley value from cooperative game theory to find the fair contribution of individual components, taking into account synergies between components. We evaluate our methods in an environment similar to the recently proposed Robo-Sumo task, where agents in a software physics simulator compete to tip over their opponent or push them out of the arena. Our results show that the proposed methods are indeed capable of generating strong agents, significantly outperforming baselines that focus on optimizing the agent policy alone. A video is available at: https://youtu.be/CH1ecRim9PI |
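
The Shapley-value attribution mentioned in the abstract can be illustrated with a minimal sketch. It assumes exact enumeration over all coalitions, which is only feasible for a handful of components; the abstract does not say how the authors compute or approximate the value. The component names (`leg_length`, `torso_mass`, `joint_range`) and the win-rate gains in `gains` are hypothetical numbers made up for illustration, not results from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a game given by `value(frozenset) -> float`."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Probability that p joins right after a given coalition of size k
            # when the players arrive in a uniformly random order.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p to coalition s.
                phi[p] += weight * (value(s | {p}) - value(s))
    return phi


# Hypothetical toy game: gain in win rate when a subset of body changes
# is applied to a baseline body. All numbers are invented for illustration.
components = ["leg_length", "torso_mass", "joint_range"]
gains = {
    frozenset(): 0.00,
    frozenset({"leg_length"}): 0.10,
    frozenset({"torso_mass"}): 0.05,
    frozenset({"joint_range"}): 0.00,
    frozenset({"leg_length", "torso_mass"}): 0.15,
    frozenset({"leg_length", "joint_range"}): 0.20,
    frozenset({"torso_mass", "joint_range"}): 0.05,
    frozenset(components): 0.30,
}

phi = shapley_values(components, gains.__getitem__)
for name in components:
    print(f"{name}: {phi[name]:.4f}")
```

In this toy game `joint_range` adds nothing on its own but still receives a positive share, because it raises the gain whenever `leg_length` is present. That is the kind of synergy the Shapley value is meant to credit. The shares also sum to the gain of applying all changes together.
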
Year | DOI | Keywords |
---|---|---|
2019 | 10.5555/3306127.3331813 | Reinforcement Learning, Evolutionary Computation |

Field | DocType | Citations |
---|---|---|
Control theory, Computer science, Policy learning, Shapley value, Evolutionary computation, Physical body, Cooperative game theory, Artificial intelligence, Machine learning, Actuator, Reinforcement learning | Conference | 1 |

PageRank | References | Authors |
---|---|---|
0.34 | 0 | 8 |

Name | Order | Citations | PageRank |
---|---|---|---|
Dylan Banarse | 1 | 1 | 0.34 |
Yoram Bachrach | 2 | 1262 | 79.07 |
Siqi Liu | 3 | 55 | 4.94 |
Guy Lever | 4 | 2 | 0.68 |
Nicolas Heess | 5 | 1762 | 94.77 |
Chrisantha Fernando | 6 | 314 | 24.46 |
Pushmeet Kohli | 7 | 7398 | 332.84 |
Thore Graepel | 8 | 5 | 4.10 |