Abstract |
---|
We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Synthesizing risk-averse optimal policies for POMDPs requires infinite memory and is thus undecidable. To overcome this difficulty, we propose a method based on bounded policy iteration for designing stochastic, finite-state (finite-memory) controllers, which takes advantage of standard convex optimization methods. Given a memory budget and an optimality criterion, the proposed method modifies the stochastic finite-state controller, yielding suboptimal solutions with lower coherent risk. |
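The abstract's central object is a stochastic finite-state controller (FSC): a fixed set of memory nodes, a randomized action choice per node, and a randomized memory update driven by observations. The following is a minimal illustrative sketch of such a controller rolled out against a stand-in observation process; all names, the two-node memory budget, and the toy probabilities are assumptions for this example, not the paper's method or model.

```python
import random

# Illustrative FSC sketch (assumed toy model, not the paper's algorithm).
# An FSC has memory nodes q, an action distribution psi(a | q), and a
# stochastic memory update eta(q' | q, o) driven by observations o.

random.seed(0)

NODES = [0, 1]                      # memory budget: two internal states
ACTIONS = ["left", "right"]
OBSERVATIONS = ["low", "high"]

# psi[q][a]: probability of choosing action a in memory node q
psi = {0: {"left": 0.8, "right": 0.2},
       1: {"left": 0.3, "right": 0.7}}

# eta[(q, o)][q2]: probability of moving to node q2 after observing o in node q
eta = {(0, "low"): {0: 0.9, 1: 0.1}, (0, "high"): {0: 0.2, 1: 0.8},
       (1, "low"): {0: 0.6, 1: 0.4}, (1, "high"): {0: 0.1, 1: 0.9}}

def sample(dist):
    """Draw a key from a {outcome: probability} dictionary."""
    r, acc = random.random(), 0.0
    for outcome, p in dist.items():
        acc += p
        if r <= acc:
            return outcome
    return outcome  # guard against floating-point round-off

def run_fsc(steps=5):
    """Roll out the FSC; observations here are a stand-in for the POMDP."""
    q, trace = 0, []
    for _ in range(steps):
        a = sample(psi[q])              # stochastic action choice
        o = random.choice(OBSERVATIONS) # placeholder observation process
        q = sample(eta[(q, o)])         # stochastic memory update
        trace.append((a, o, q))
    return trace

trace = run_fsc()
print(trace)
```

In the paper's setting, the distributions `psi` and `eta` would be the decision variables tuned by bounded policy iteration (via convex optimization) subject to the memory budget and the coherent-risk objective; the sketch above only shows how a fixed FSC executes.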
Year | DOI | Venue |
---|---|---|
2020 | 10.23919/ACC45564.2020.9147792 | 2020 American Control Conference (ACC) |

DocType | ISSN | Citations |
---|---|---|
Conference | 0743-1619 | 0 |

PageRank | References | Authors |
---|---|---|
0.34 | 0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Mohamadreza Ahmadi | 1 | 3 | 4.10 |
Masahiro Ono | 2 | 0 | 0.34 |
Michel D. Ingham | 3 | 90 | 8.24 |
Richard M. Murray | 4 | 12322 | 1223.70 |
Aaron D. Ames | 5 | 1202 | 136.68 |