Abstract | ||
---|---|---|
In many POMDP applications in computational sustainability, it is important that the computed policy have a simple description, so that it can be easily interpreted by stakeholders and decision makers. One measure of simplicity for POMDP value functions is the number of α-vectors required to represent the value function. Existing POMDP methods seek to optimize the accuracy of the value function, which can require a very large number of α-vectors. This paper studies methods that allow the user to explore the tradeoff between the accuracy of the value function and the number of α-vectors. Building on previous point-based POMDP solvers, this paper introduces a new algorithm (α-min) that formulates a Mixed Integer Linear Program (MILP) to calculate approximate solutions for finite-horizon POMDP problems with limited numbers of α-vectors. At each time-step, α-min calculates α-vectors to greedily minimize the gap between current upper and lower bounds of the value function. In doing so, good upper and lower bounds are quickly reached allowing a good approximation of the problem with few α-vectors. Experimental results show that α-min provides good approximate solutions given a fixed number of α-vectors on small benchmark problems, on a larger randomly generated problem, as well as on a computational sustainability problem to best manage the endangered Sumatran tiger. |
Year | Venue | Field |
---|---|---|
2015 | IJCAI | Integer,Mathematical optimization,Partially observable Markov decision process,Computational sustainability,Upper and lower bounds,Computer science,Bellman equation,Large numbers,Linear programming,Solver |
DocType | Citations | PageRank |
Conference | 0 | 0.34 |
References | Authors | |
13 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Yann Dujardin | 1 | 0 | 0.68 |
Thomas G. Dietterich | 2 | 9336 | 1722.57 |
Iadine Chades | 3 | 36 | 6.00 |