Title
Policy Evaluation Using the Ω-Return
Abstract
We propose the Ω-return as an alternative to the λ-return currently used by the TD(λ) family of algorithms. The benefit of the Ω-return is that it accounts for the correlation between returns of different lengths. Because the Ω-return is difficult to compute exactly, we suggest one way of approximating it. We provide empirical studies suggesting that this approximation is superior to the λ-return and γ-return for a variety of problems.
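The contrast drawn in the abstract can be illustrated with a minimal sketch. The λ-return is a weighted average of n-step returns whose weights depend only on λ, so it ignores how those returns co-vary. The second weighting shown here is the standard minimum-variance average for correlated estimates, w = Σ⁻¹1 / (1ᵀΣ⁻¹1); treating it as the form of the Ω-return is an assumption, since this record does not state the paper's definition. All function names are hypothetical, and the covariance matrix is supplied by hand rather than estimated.

```python
import numpy as np


def n_step_returns(rewards, values, gamma):
    """n-step returns G^(1)..G^(L) from the start of one finite episode.

    rewards[k] is the reward received on step k; values[k] is the value
    estimate of the state reached after step k (0.0 if it is terminal).
    """
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    discounts = gamma ** np.arange(len(rewards))
    discounted_sum = np.cumsum(discounts * rewards)
    # G^(k+1) = sum_{j<=k} gamma^j r_j + gamma^(k+1) * v(s_{k+1})
    return discounted_sum + gamma * discounts * values


def lambda_weights(num_returns, lam):
    """Lambda-return weights for a finite episode; they sum to 1."""
    w = (1.0 - lam) * lam ** np.arange(num_returns)
    w[-1] = lam ** (num_returns - 1)  # remaining mass goes to the full return
    return w


def covariance_weights(cov):
    """Minimum-variance weights for correlated unbiased estimates.

    ASSUMPTION: this is the generic statistical formula, used here only
    to illustrate weighting that accounts for correlation.
    """
    ones = np.ones(cov.shape[0])
    solved = np.linalg.solve(cov, ones)
    return solved / (ones @ solved)


returns = n_step_returns(rewards=[1.0, 1.0], values=[0.5, 0.0], gamma=0.5)
lambda_return = lambda_weights(len(returns), lam=0.5) @ returns

# Hypothetical covariance of the two returns, chosen for illustration only.
cov = np.array([[1.0, 0.0], [0.0, 4.0]])
weighted_return = covariance_weights(cov) @ returns
```

With a diagonal covariance the second weighting reduces to inverse-variance weighting; off-diagonal entries are where correlation between returns of different lengths would enter. In practice the covariance is unknown, which is consistent with the abstract's remark that the Ω-return must be approximated.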
Year
2015
Venue
Annual Conference on Neural Information Processing Systems
Field
Mathematical optimization, Computer science, Correlation, Empirical research
DocType
Conference
Citations
1
PageRank
0.39
References
10
Authors
4
Name                   Order   Citations   PageRank
Philip S. Thomas       1       184         22.55
S. Niekum              2       165         23.73
Georgios Theocharous   3       140         16.65
George Konidaris       4       801         59.30