| Abstract |
|---|
| HEXQ is a reinforcement learning algorithm that decomposes a problem into subtasks and constructs a hierarchy from the state variables. The maximum number of hierarchy levels is bounded by the number of variables representing a state. In HEXQ, values learned for a subtask can be reused in different contexts only if the subtasks are identical; otherwise, values for non-identical subtasks must be trained separately. This paper introduces a method that relaxes these two restrictions. Experimental results show that the method dramatically reduces training time. |
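The reuse condition the abstract describes — identical subtasks share learned values, while non-identical subtasks must be trained separately — can be sketched in a few lines. This is an illustrative sketch only, under assumed names (`SubtaskValueCache`, string signatures), not the paper's actual algorithm:

```python
# Illustrative sketch (not the paper's method): value reuse across
# identical subtasks in a state-variable hierarchy, HEXQ-style.
# Subtasks with the same structural signature share one value table,
# so training in one context transfers to every identical context.

from collections import defaultdict

class SubtaskValueCache:
    """Share learned value tables among structurally identical subtasks."""
    def __init__(self):
        self._tables = {}  # signature -> value table

    def table_for(self, signature):
        # Identical subtasks (same signature) reuse one value table;
        # a new signature gets a fresh, untrained table.
        if signature not in self._tables:
            self._tables[signature] = defaultdict(float)
        return self._tables[signature]

cache = SubtaskValueCache()

# Two rooms with identical layouts map to the same subtask signature,
# so values learned in room A are immediately available in room B.
room_a = cache.table_for("navigate-to-exit")
room_a[(0, 0)] = 1.5            # value learned while training in room A

room_b = cache.table_for("navigate-to-exit")
print(room_b[(0, 0)])           # prints 1.5 -- reused, no retraining

# A differently shaped room is a non-identical subtask: fresh table.
odd_room = cache.table_for("navigate-odd-shape")
print(odd_room[(0, 0)])         # prints 0.0 -- must be trained separately
```

The design choice is simply keying storage by a subtask-identity signature; the paper's contribution addresses the case where subtasks are *not* identical and this naive sharing fails.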
| Year | DOI | Venue |
|---|---|---|
| 2005 | 10.1007/11553939_79 | KES (3) |
| Keywords | Field | DocType |
|---|---|---|
| different context, state variable, training time, non-identical subtasks, reinforcement learning, maximum number, hidden hierarchy | Computer science, Artificial intelligence, State variable, Reinforcement learning algorithm, Hierarchy, Reinforcement learning | Conference |
| Volume | ISSN | ISBN |
|---|---|---|
| 3683 | 0302-9743 | 3-540-28896-1 |

| Citations | PageRank | References |
|---|---|---|
| 0 | 0.34 | 6 |
| Authors |
|---|
| 3 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Geoff Poulton | 1 | 72 | 13.25 |
| Ying Guo | 2 | 0 | 0.34 |
| Wen Lu | 3 | 25 | 3.35 |