Title
A Variance Analysis for POMDP Policy Evaluation.
Abstract
Partially Observable Markov Decision Processes have been studied widely as a model for decision making under uncertainty, and a number of methods have been developed to find the solutions for such processes. Such studies often involve calculation of the value function of a specific policy, given a model of the transition and observation probabilities, and the reward. These models can be learned using labeled samples of on-policy trajectories. However, when using empirical models, some bias and variance terms are introduced into the value function as a result of imperfect models. In this paper, we propose a method for estimating the bias and variance of the value function in terms of the statistics of the empirical transition and observation model. Such error terms can be used to meaningfully compare the value of different policies. This is an important result for sequential decision-making, since it will allow us to provide more formal guarantees about the quality of the policies we implement. To evaluate the precision of the proposed method, we provide supporting experiments on problems from the field of robotics and medical decision making.
Year
2008
DOI
10.1901/jaba.2008.2-1056
Venue
AAAI
Keywords
variance analysis, pomdp policy evaluation, variance term, medical decision, imperfect model, empirical model, observation probability, observation model, value function, important result, empirical transition, biomedical research, bioinformatics
Field
Empirical modelling, Mathematical optimization, Imperfect, Observable, Computer science, Partially observable Markov decision process, Markov decision process, Bellman equation, Artificial intelligence, Machine learning, Robotics, Analysis of variance
DocType
Conference
Volume
2
ISSN
2159-5399
Citations
5
PageRank
0.41
References
11
Authors
3
Name | Order | Citations | PageRank
Mahdi Milani Fard | 1 | 57 | 7.19
Joelle Pineau | 2 | 2857 | 184.18
Peng Sun | 3 | 420 | 26.68