Abstract |
---|
Applications of multi-agent systems, such as cooperative transport, are found in various real-world domains. Due to the complexity inherent in multi-agent systems, however, handling them with preprogramming is difficult. Multi-agent reinforcement learning (MARL), a framework in which multiple agents in the same environment learn their policies simultaneously using reinforcement learning, is therefore receiving attention. In conventional MARL, although decentralization is essential for feasible learning, rewards for the agents have been allocated by a centralized system in the environment. Instead of such "top-down" MARL, to achieve completely distributed autonomous systems, we tackle a new paradigm named "bottom-up" MARL, in which the agents obtain their own rewards. Bottom-up MARL requires the agents to share their respective rewards so that orderly group behaviors emerge, which cannot be achieved merely by maximizing the mean of those rewards. We therefore propose an architecture with three components: estimating the rewards of other agents; selecting which rewards to reinforce based on their correlation; and promoting exploration to find unknown correlations. Numerical simulations performed in stages verify that every component of the proposed architecture is essential. A similar task is also accomplished in a dynamical simulation under the same conditions as the actual robots. |
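The three components named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name, the pass-through reward estimator, the Pearson-correlation selection rule, the `threshold`, and the `explore_prob` exploration rate are all illustrative assumptions; the paper's actual estimator and selection mechanism may differ.

```python
import random
from statistics import mean


def correlation(xs, ys):
    """Pearson correlation between two equal-length reward histories."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


class BottomUpRewardSharing:
    """Hypothetical sketch of the three components: (1) estimate other
    agents' rewards, (2) select which estimated rewards to reinforce based
    on their correlation with one's own reward, and (3) occasionally
    include an uncorrelated reward to discover unknown correlations."""

    def __init__(self, n_others, threshold=0.5, explore_prob=0.1):
        self.own_history = []
        self.other_histories = [[] for _ in range(n_others)]
        self.threshold = threshold        # correlation needed to share a reward
        self.explore_prob = explore_prob  # chance to try an unselected reward

    def estimate_other_reward(self, observation):
        # Component 1: in the paper this would be a learned estimator;
        # here we simply pass the observed value through.
        return observation

    def combined_reward(self, own_reward, other_observations, rng=random):
        """Return the reward signal this agent would actually reinforce."""
        self.own_history.append(own_reward)
        total = own_reward
        for i, obs in enumerate(other_observations):
            est = self.estimate_other_reward(obs)
            self.other_histories[i].append(est)
            rho = correlation(self.own_history, self.other_histories[i])
            # Component 2: reinforce rewards positively correlated with ours.
            # Component 3: with small probability, include an uncorrelated
            # reward so that not-yet-visible correlations can be found.
            if rho > self.threshold or rng.random() < self.explore_prob:
                total += est
        return total
```

The combined reward would then be fed into any standard reinforcement-learning update in place of the agent's raw individual reward.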
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/SMC.2018.00607 | 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC) |
Keywords | Field | DocType |
---|---|---|
Distributed autonomous system, Reinforcement learning, Stochastic neural network | Architecture, Dynamical simulation, Computer simulation, Computer science, Stochastic neural network, Top-down and bottom-up design, Autonomous system (Internet), Artificial intelligence, Robot, Machine learning, Reinforcement learning | Conference |
ISSN | Citations | PageRank |
---|---|---|
1062-922X | 0 | 0.34 |
References | Authors |
---|---|
0 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Takumi Aotani | 1 | 0 | 1.01 |
Taisuke Kobayashi | 2 | 31 | 10.92 |
Kenji Sugimoto | 3 | 30 | 10.35 |