Title
Decentralized optimal large scale multi-player pursuit-evasion strategies: A mean field game approach with reinforcement learning
Abstract
This paper investigates intelligent design for pursuit-evasion games with large numbers of pursuers and evaders. With so many agents, the notorious “Curse of Dimensionality” seriously challenges traditional multi-player pursuit-evasion design, especially in harsh environments where limited communication resources restrict information exchange among players. To address this challenge, the emerging Mean Field Games (MFG) theory is used to derive the optimal pursuit-evasion strategies from a new form of probability density function (PDF) instead of detailed information from all other players/agents. This reduces both the information exchange and the computational dimension of the optimal strategy derivation. Specifically, MFG is integrated into the pursuit-evasion game to generate a hierarchical structure in which the pursuers and the evaders form two separate mean field groups. To solve the mean field equations (two coupled partial differential equations) online, the actor-critic reinforcement learning mechanism is adopted and extended to a novel actor-critic-mass-opponent (ACMO) approach. In ACMO, the actor neural network estimates the optimal control, the critic neural network approximates the optimal cost function, the mass neural network learns the PDF of the agent’s own group, and the opponent neural network predicts the opponents’ average states in the form of the PDF that causes the maximum cost for the agent’s group. Lyapunov theory is used to analyze the convergence of all neural networks and the stability of the closed-loop system. Finally, a series of numerical simulations demonstrates the effectiveness of the developed scheme.
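The dimensionality argument in the abstract can be illustrated with a minimal sketch. It is not the paper's ACMO method: it only shows, under simplifying assumptions, how summarizing each group by an estimated PDF gives every agent a fixed-size input regardless of how many agents there are. The scalar (1-D) agent states, the histogram estimator, the grid bounds, and the names `empirical_pdf` and `group_summary` are all hypothetical choices made for illustration.

```python
import numpy as np

def empirical_pdf(states, bins=20, lo=-5.0, hi=5.0):
    """Histogram estimate of a group's state density on a fixed grid.

    Assumption: agent states are scalars; the paper's state space may differ.
    """
    hist, _ = np.histogram(states, bins=bins, range=(lo, hi), density=True)
    return hist

def group_summary(pursuers, evaders):
    """Concatenate both groups' density estimates into one fixed-size vector."""
    return np.concatenate([empirical_pdf(pursuers), empirical_pdf(evaders)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n_agents in (100, 10_000):
        # Hypothetical group distributions, chosen only for the demo.
        pursuers = rng.normal(-1.0, 1.0, size=n_agents)
        evaders = rng.normal(2.0, 0.5, size=n_agents)
        summary = group_summary(pursuers, evaders)
        # The summary length depends on the grid, not on n_agents.
        print(n_agents, summary.shape)
```

With the default grid, the summary has 40 entries whether each group holds 100 or 10,000 agents, which is the sense in which a PDF-based description sidesteps the growth in dimension.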
Year: 2022
DOI: 10.1016/j.neucom.2021.01.141
Venue: Neurocomputing
Keywords: Approximate dynamic programming, Optimal control, Mean field game theory, Pursuit-evasion game, Reinforcement learning
DocType: Journal
Volume: 484
ISSN: 0925-2312
Citations: 0
PageRank: 0.34
References: 0
Authors: 2
Name | Order | Citations | PageRank
Zejian Zhou | 1 | 2 | 3.42
Hao Xu | 2 | 0 | 0.34