Title
The Boundedness Conditions for Model-Free HDP(λ).
Abstract
This paper provides the stability analysis for a model-free action-dependent heuristic dynamic programming (HDP) approach with an eligibility trace long-term prediction parameter (λ). HDP(λ) learns from more than one future reward. Eligibility traces have long been popular in Q-learning. This paper proves and demonstrates that they are worthwhile to use with HDP. In this paper, we prove its uniform...
Year
2019
DOI
10.1109/TNNLS.2018.2875870
Venue
IEEE Transactions on Neural Networks and Learning Systems
Keywords
Computational modeling, Robots, Learning systems, Stability criteria, Simulation, Trajectory
DocType
Journal
Volume
30
Issue
7
ISSN
2162-237X
Citations
3
PageRank
0.37
References
12
Authors
2
Name                   Order   Citations   PageRank
Seaar Al-Dabooni       1       10          1.82
Wunsch II Donald C.    2       1354        91.73