Abstract |
---|
We consider a finite-state, finite-action, infinite-horizon, discounted reward Markov decision process and study the bias and variance in the value function estimates that result from empirical estimates of the model parameters. We provide closed-form approximations for the bias and variance, which can then be used to derive confidence intervals around the value function estimates. We illustrate and validate our findings using a large database describing the transaction and mailing histories for customers of a mail-order catalog firm. |
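
The bias and variance described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' closed-form approximations: it fixes a policy (so the process reduces to a Markov reward process), repeatedly re-estimates the transition matrix from simulated data, and measures the bias and variance of the resulting value estimates by Monte Carlo. The function names, the three-state chain, the rewards, and the discount factor are all hypothetical choices made for illustration.

```python
import numpy as np


def value_function(P, r, gamma):
    """Solve V = r + gamma * P @ V for a fixed policy."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, r)


def empirical_transition_matrix(P, n_samples, rng):
    """Estimate each row of P from n_samples simulated transitions."""
    counts = np.vstack([rng.multinomial(n_samples, row) for row in P])
    return counts / n_samples


def simulate_bias_variance(P, r, gamma, n_samples, n_reps, seed=0):
    """Monte Carlo bias and variance of the plug-in value estimate."""
    rng = np.random.default_rng(seed)
    v_true = value_function(P, r, gamma)
    estimates = np.array([
        value_function(empirical_transition_matrix(P, n_samples, rng), r, gamma)
        for _ in range(n_reps)
    ])
    bias = estimates.mean(axis=0) - v_true
    variance = estimates.var(axis=0, ddof=1)
    return v_true, bias, variance


# Hypothetical three-state example (values assumed, not taken from the paper).
P = np.array([
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.1, 0.3, 0.6],
])
r = np.array([1.0, 0.0, 5.0])
gamma = 0.9

v_true, bias, variance = simulate_bias_variance(P, r, gamma, n_samples=50, n_reps=2000)
print("true values:", v_true)
print("bias:       ", bias)
print("variance:   ", variance)
```

Because the value function is a nonlinear function of the transition matrix, the plug-in estimate is generally biased even though each estimated transition probability is unbiased. A rough normal-approximation interval for a single estimate would be the estimate plus or minus 1.96 times the square root of the variance, under the assumption that the sampling distribution is close to Gaussian.
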
Year | DOI | Venue |
---|---|---|
2007 | 10.1287/mnsc.1060.0614 | Management Science |
Keywords | Field | DocType |
variance approximation,confidence interval,model parameter,value function,variance,mailing history,discounted reward markov decision,large database,closed-form approximation,mail-order catalog firm,value function estimate,empirical estimate,bias,value function estimates,markov decision process | Transaction processing,Econometrics,Markov process,Function approximation,Markov decision process,Bellman equation,Statistics,Confidence interval,Database transaction,Finite horizon,Mathematics | Journal |
Volume | Issue | ISSN |
53 | 2 | 0025-1909 |
Citations | PageRank | References |
43 | 2.58 | 6 |
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Shie Mannor | 1 | 3340 | 285.45 |
Duncan I. Simester | 2 | 260 | 20.45 |
Peng Sun | 3 | 420 | 26.68 |
John N. Tsitsiklis | 4 | 5300 | 621.34 |