Abstract |
---|
This paper analyzes the slow convergence and inefficient use of experience in SARSA(lambda) learning. A least-squares approximation model of the state-action value function is then constructed from current and previous experience. From this model, a set of linear equations is derived that the weight vector of the function approximator must satisfy over a set of basis functions. On this basis, a fast and practical least-squares SARSA(lambda) algorithm and an improved recursive algorithm are proposed. Experiments on an inverted pendulum demonstrate that these algorithms effectively improve convergence speed and the efficiency of experience exploitation. |
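
The abstract does not give the equations themselves, so the following is only a minimal sketch of what a recursive least-squares SARSA(lambda) update might look like. It assumes the linear system has the familiar LSTD(lambda) form `A w = b`, with `A` accumulated from `z (phi - gamma * phi_next)^T` and `b` from `z * reward`, and it tracks the inverse of `A` with a Sherman-Morrison update. The class name, the parameter names (`gamma`, `lam`, `delta`), and the two-step toy task are all hypothetical and are not taken from the paper.

```python
class RecursiveLSSarsaLambda:
    """Sketch of a recursive least-squares SARSA(lambda)-style learner.

    Assumption: the linear system is A w = b with
    A = sum_t z_t (phi_t - gamma * phi_{t+1})^T and b = sum_t z_t r_t,
    where phi_t are features of the state-action pair actually visited.
    """

    def __init__(self, n_features, gamma=0.9, lam=0.5, delta=1e-3):
        self.n = n_features
        self.gamma = gamma
        self.lam = lam
        self.w = [0.0] * n_features   # weight vector of the approximator
        self.z = [0.0] * n_features   # eligibility trace
        # P tracks the inverse of (delta * I + A); start from I / delta.
        self.P = [[(1.0 / delta if i == j else 0.0) for j in range(n_features)]
                  for i in range(n_features)]

    def start_episode(self):
        # Traces do not carry over between episodes.
        self.z = [0.0] * self.n

    def q_value(self, phi):
        return sum(wi * pi for wi, pi in zip(self.w, phi))

    def update(self, phi, reward, phi_next):
        """Process one transition; phi_next is the feature vector of the next
        state-action pair chosen by the policy (all zeros at a terminal state)."""
        n = self.n
        self.z = [self.gamma * self.lam * zi + pi for zi, pi in zip(self.z, phi)]
        d = [pi - self.gamma * qi for pi, qi in zip(phi, phi_next)]
        Pz = [sum(self.P[i][j] * self.z[j] for j in range(n)) for i in range(n)]
        dP = [sum(d[i] * self.P[i][j] for i in range(n)) for j in range(n)]
        denom = 1.0 + sum(d[i] * Pz[i] for i in range(n))
        gain = [v / denom for v in Pz]
        err = reward - sum(di * wi for di, wi in zip(d, self.w))
        self.w = [wi + gi * err for wi, gi in zip(self.w, gain)]
        # Sherman-Morrison rank-one update of the inverse.
        self.P = [[self.P[i][j] - gain[i] * dP[j] for j in range(n)]
                  for i in range(n)]


# Hypothetical two-step task with one-hot features:
# pair 0 -> pair 1 (reward 0), pair 1 -> terminal (reward 1).
agent = RecursiveLSSarsaLambda(n_features=2, gamma=0.9, lam=0.5)
for _ in range(50):
    agent.start_episode()
    agent.update([1.0, 0.0], 0.0, [0.0, 1.0])
    agent.update([0.0, 1.0], 1.0, [0.0, 0.0])
print(agent.q_value([1.0, 0.0]), agent.q_value([0.0, 1.0]))
```

Because every transition updates the same linear system, earlier experience keeps contributing to the weights, which is the kind of experience reuse the abstract refers to. The small `delta` only keeps the initial inverse well defined and introduces a slight bias that shrinks as more data arrives.
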
Year | DOI | Venue |
---|---|---|
2008 | 10.1109/ICNC.2008.694 | ICNC |

Keywords | Field | DocType |
---|---|---|
inverted pendulum,practical least-squares sarsa,reinforcement learning,learning (artificial intelligence),linear equations,improved recursive algorithm,slow convergence speed,convergence speed,function approximation,previous experience,experience exploitation,least squares approximations,least-squares sarsa,value function,function approximator,least-squares approximation model,least-squares sarsa(lambda) algorithms,low efficiency,state-action pair value function,learning artificial intelligence,recursive algorithm,least square,least squares approximation,satisfiability | Least squares,Convergence (routing),Recursion (computer science),Function approximation,Computer science,Artificial intelligence,Reinforcement learning,Inverted pendulum,Mathematical optimization,Least squares support vector machine,Algorithm,Weight,Machine learning | Conference |

Volume | ISBN | Citations |
---|---|---|
2 | 978-0-7695-3304-9 | 1 |

PageRank | References | Authors |
---|---|---|
0.38 | 6 | 2 |

Name | Order | Citations | PageRank |
---|---|---|---|
Shenglei Chen | 1 | 18 | 4.05 |
Yan-Mei Wei | 2 | 1 | 0.38 |