Title
On the convergence of reinforcement learning with Monte Carlo Exploring Starts
Abstract
A basic simulation-based reinforcement learning algorithm is the Monte Carlo Exploring Starts (MCES) method, also known as optimistic policy iteration, in which the value function is approximated by simulated returns and a greedy policy is selected at each iteration. The convergence of this algorithm in the general setting has been an open question. In this paper, we investigate the convergence of this algorithm for the case with undiscounted costs, also known as the stochastic shortest path problem. The results complement existing partial results on this topic and thereby help further settle the open problem.
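To make the algorithmic setting concrete, below is a minimal, textbook-style sketch of Monte Carlo Exploring Starts on a toy undiscounted (stochastic shortest path) problem. The chain environment, its transition probabilities, and all names are illustrative assumptions for this sketch; they are not taken from the paper, which analyzes convergence rather than prescribing an implementation.

```python
import random
from collections import defaultdict

N_STATES, GOAL = 5, 4          # chain of states 0..4; state 4 is the goal
ACTIONS = (0, 1)               # 0 = move left, 1 = move right (assumed toy MDP)

def step(state, action):
    """One transition of the toy MDP: unit cost per step, moves succeed w.p. 0.9."""
    if state == GOAL:
        return state, 0.0      # goal is absorbing and cost-free
    if random.random() < 0.9:
        state = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return state, 1.0

def mces(num_episodes=20000, max_steps=200, seed=0):
    random.seed(seed)
    Q = defaultdict(float)     # estimated cost-to-go for each (state, action)
    counts = defaultdict(int)  # visit counts for incremental averaging
    policy = {s: random.choice(ACTIONS) for s in range(N_STATES)}
    for _ in range(num_episodes):
        # Exploring start: every (state, action) pair is chosen with
        # positive probability -- this is what "Exploring Starts" refers to.
        s, a = random.randrange(N_STATES), random.choice(ACTIONS)
        episode = []
        for _ in range(max_steps):  # cap length in case the policy never reaches the goal
            s2, cost = step(s, a)
            episode.append((s, a, cost))
            if s2 == GOAL:
                break
            s, a = s2, policy[s2]
        # Undiscounted returns (total cost-to-go), accumulated backwards.
        G, returns = 0.0, []
        for (s, a, cost) in reversed(episode):
            G += cost
            returns.append((s, a, G))
        returns.reverse()
        # First-visit Monte Carlo update of the Q estimates.
        visited = set()
        for (s, a, G) in returns:
            if (s, a) in visited:
                continue
            visited.add((s, a))
            counts[(s, a)] += 1
            Q[(s, a)] += (G - Q[(s, a)]) / counts[(s, a)]
        # Policy improvement: greedy (cost-minimizing) with respect to Q.
        for s in range(N_STATES):
            policy[s] = min(ACTIONS, key=lambda a: Q[(s, a)])
    return Q, policy

if __name__ == "__main__":
    Q, policy = mces()
    print({s: ("left", "right")[policy[s]] for s in range(N_STATES)})
```

Running this sketch should yield the "right" action in every non-goal state. Note that capping episode length truncates returns under poor early policies, a practical workaround in the undiscounted setting, where episodes need not terminate; the convergence questions studied in the paper concern the idealized algorithm, not this toy implementation.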
Year
2021
DOI
10.1016/j.automatica.2021.109693
Venue
Automatica
Keywords
Reinforcement learning, Markov decision processes, Stochastic control, Monte Carlo Exploring Starts, Optimistic policy iteration, Convergence, Stochastic shortest path problem
DocType
Journal
Volume
129
Issue
1
ISSN
0005-1098
Citations
0
PageRank
0.34
References
0
Authors
1
Name
Jun Liu
Order
1
Citations
2152
PageRank
0.63