| Abstract |
|---|
| We consider the stochastic shortest path planning problem in MDPs, i.e., the problem of designing policies that ensure reaching a goal state from a given initial state with minimum accrued cost. In order to account for rare but important realizations of the system, we consider a nested dynamic coherent risk total cost functional rather than the conventional risk-neutral total expected cost. Under some assumptions, we show that optimal, stationary, Markovian policies exist and can be found via a special Bellman's equation. We propose a computational technique based on difference convex programs (DCPs) to find the associated value functions and therefore the risk-averse policies. A rover navigation MDP is used to illustrate the proposed methodology with conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures. |
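As a rough illustration of the CVaR risk measure named in the abstract, the following sketch estimates static CVaR from sampled costs, i.e., the mean of the worst (1 − α) fraction of outcomes. This is only for intuition: the paper's nested dynamic risk formulation and the DCP-based computation are more involved, and the function name and interface here are illustrative, not from the paper.

```python
import math

def cvar(costs, alpha):
    """Sample-based CVaR at level alpha: the mean of the worst
    (1 - alpha) fraction of sampled costs (larger cost = worse).

    Illustrative only; the paper uses a nested dynamic coherent
    risk functional, not this static sample estimate.
    """
    xs = sorted(costs, reverse=True)          # worst outcomes first
    k = max(1, math.ceil(len(xs) * (1.0 - alpha)))  # tail sample count
    return sum(xs[:k]) / k                    # average over the tail
```

For example, over costs 1 through 10 with α = 0.8, the tail is the two largest samples, giving a CVaR of 9.5; as α → 1, CVaR approaches the worst-case cost, which is the risk-averse behavior the abstract motivates.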
| Year | DOI | Venue |
|---|---|---|
| 2021 | 10.1109/CDC45484.2021.9683527 | CDC |
| DocType | Citations | PageRank |
|---|---|---|
| Conference | 0 | 0.34 |

| References | Authors |
|---|---|
| 0 | 4 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Mohamadreza Ahmadi | 1 | 3 | 4.10 |
| Anushri Dixit | 2 | 0 | 0.34 |
| Burdick, J.W. | 3 | 2988 | 516.87 |
| Aaron D. Ames | 4 | 0 | 2.37 |