Abstract |
---|
This paper provides the stability analysis for a model-free action-dependent heuristic dynamic programming (HDP) approach with an eligibility trace long-term prediction parameter (λ). HDP(λ) learns from more than one future reward. Eligibility traces have long been popular in Q-learning. This paper proves and demonstrates that they are worthwhile to use with HDP. In this paper, we prove its uniform... |
Year | DOI | Venue |
---|---|---|
2019 | 10.1109/TNNLS.2018.2875870 | IEEE Transactions on Neural Networks and Learning Systems |

Keywords | DocType | Volume |
---|---|---|
Computational modeling, Robots, Learning systems, Stability criteria, Simulation, Trajectory | Journal | 30 |

Issue | ISSN | Citations |
---|---|---|
7 | 2162-237X | 3 |

PageRank | References | Authors |
---|---|---|
0.37 | 12 | 2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Seaar Al-Dabooni | 1 | 10 | 1.82 |
Wunsch II Donald C. | 2 | 1354 | 91.73 |