Title
Reward Tuning for self-adaptive Policy in MDP based Distributed Decision-Making to ensure a Safe Mission Planning
Abstract
The Markov Decision Process (MDP) has become a standard model for sequential decision making under uncertainty. Planning with an MDP yields the sequence of actions that achieves the goal of the mission efficiently. Often a single agent makes the decisions and performs a single action at a time. However, in several fields such as robotics, several actions can be executed simultaneously. Moreover, as missions grow more complex, decomposing an MDP into several sub-MDPs becomes necessary. The decomposition involves parallel decisions between different agents, but the execution of concurrent actions can lead to conflicts. In addition, problems due to system and sensor failures may appear during the mission; these can lead to negative consequences (e.g. the crash of a UAV caused by a drop in battery charge). In this article, we present a new method to prevent the behavior conflicts that can appear within distributed decision-making and to bias action selection where needed so that the safety and the various requirements of the system are met. This method takes into consideration the different constraints due to antagonistic actions, while additionally considering thresholds on transition functions to promote specific actions that guarantee the safety of the system. It then automatically computes the rewards of the different MDPs related to the mission in order to establish a safe plan. We validate this method on a UAV case study, namely a tracking mission. From the list of constraints identified for the mission, the rewards of the MDPs are recomputed in order to avoid all potential conflicts and violations of constraints related to the safety of the system, thereby ensuring a safe specification of the mission.
Year
2020
DOI
10.1109/DSN-W50199.2020.00025
Venue
2020 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W)
Keywords
Markov Decision Process, Concurrent Actions, Reward Tuning, Behavior Conflicts, Constraints on MDPs
DocType
Conference
ISSN
2325-6648
ISBN
978-1-7281-7264-4
Citations
0
PageRank
0.34
References
3
Authors
3