An Empirical Relative Value Learning Algorithm For Non-Parametric MDPs With Continuous State Space - Citegraph

Paper Info

Title
An Empirical Relative Value Learning Algorithm For Non-Parametric MDPs With Continuous State Space

Abstract
We propose an empirical relative value learning (ERVL) algorithm for non-parametric MDPs with a continuous state space, finite actions, and the average-reward criterion. The ERVL algorithm relies on function approximation via nearest neighbors and on minibatch samples for the value function update. It is universal (it will work for any MDP) and computationally quite simple, yet it provides an arbitrarily good approximation with high probability in finite time. As far as we know, this is the first algorithm with these provable properties for non-parametric, continuous-state-space MDPs under the average-reward criterion. Numerical evaluation on a benchmark problem of optimal replacement suggests good performance.
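The ingredients named in the abstract (nearest-neighbor function approximation, minibatch samples for the value update, and a relative value normalization for the average-reward setting) can be illustrated with a rough sketch. This is not the authors' algorithm or code: the function names `nn_predict` and `ervl_sketch`, the 1-nearest-neighbor rule, the choice of the first sampled state as reference state, all parameter defaults, and the toy replacement-style MDP (`step`, `reward`) are assumptions made only for illustration.

```python
import numpy as np

def nn_predict(anchors, values, queries):
    """1-nearest-neighbor lookup: value stored at the closest anchor state."""
    idx = np.abs(queries[:, None] - anchors[None, :]).argmin(axis=1)
    return values[idx]

def ervl_sketch(step, reward, n_actions, n_states=50, batch=20, iters=30, seed=0):
    """Illustrative empirical relative value iteration on states in [0, 1]."""
    rng = np.random.default_rng(seed)
    anchors = np.sort(rng.uniform(0.0, 1.0, size=n_states))  # sampled states
    v = np.zeros(n_states)
    gain = 0.0
    for _ in range(iters):
        q = np.empty((n_states, n_actions))
        for a in range(n_actions):
            # Draw a fresh minibatch of next states for every anchor state.
            nxt = step(np.repeat(anchors, batch), a, rng)
            # Empirical expectation of the nearest-neighbor value estimate.
            ev = nn_predict(anchors, v, nxt).reshape(n_states, batch).mean(axis=1)
            q[:, a] = reward(anchors, a) + ev
        tv = q.max(axis=1)   # empirical Bellman update at the anchors
        gain = tv[0]         # value at the reference state (rough gain estimate)
        v = tv - gain        # relative value: reference state is pinned to 0
    return anchors, v, gain

# Hypothetical toy problem: state = wear level, action 1 = replace, 0 = keep.
def step(s, a, rng):
    if a == 1:
        return rng.uniform(0.0, 0.1, size=s.shape)   # nearly new after replacement
    return np.minimum(1.0, s + rng.uniform(0.0, 0.2, size=s.shape))

def reward(s, a):
    return np.full_like(s, -0.5) if a == 1 else 1.0 - s

anchors, v, gain = ervl_sketch(step, reward, n_actions=2)
```

Subtracting the value at a fixed reference state after each update keeps the iterates bounded, which is the usual reason for working with relative values under the average-reward criterion. The paper's actual sampling scheme, approximation rule, and guarantees may differ from this sketch.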

Year	DOI	Venue
2019	10.23919/ECC.2019.8795982	2019 18th European Control Conference (ECC)
Field	DocType	Citations
Function approximation, Algorithm, Nonparametric statistics, Bellman equation, Relative value, State space, Mathematics, Finite time	Conference	0
PageRank	References	Authors
0.34	0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Hiteshi Sharma	1	0	2.37
Rahul Jain	2	784	71.51
Abhishek Gupta	3	0	0.68
