Title
Learning continuous-action control policies
Abstract
Reinforcement learning for control in stochastic processes has received significant attention in recent years. Several data-efficient methods, even for continuous state spaces, have been proposed; however, most of them assume a small, discrete action space. While continuous action spaces are quite common in real-world problems, the most common approach in practice is still coarse discretization of the action space. This paper presents a novel, computationally efficient method, called adaptive action modification, for realizing continuous-action policies using binary decisions that correspond to adaptive increment or decrement changes in the values of the continuous action variables. The proposed approach approximates any continuous action space to arbitrary resolution and can be combined with any discrete-action reinforcement learning algorithm to learn continuous-action policies. Our approach is coupled with three well-known reinforcement learning algorithms (Q-learning, fitted Q-iteration, and least-squares policy iteration), and its use and properties are thoroughly investigated and demonstrated on the continuous state-action inverted pendulum and bicycle balancing and riding domains.
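The core idea described in the abstract can be sketched in a few lines: a discrete-action learner picks only a binary increment/decrement decision per action variable, and an adaptive step size translates those decisions into a continuous action value. The doubling/halving adaptation rule, the class name, and all parameter names below are illustrative assumptions, not the paper's exact scheme.

```python
class AdaptiveActionModifier:
    """Hypothetical sketch: realize a continuous action via binary
    increment/decrement decisions with an adaptive step size.
    The doubling/halving rule is an assumption for illustration."""

    def __init__(self, low, high, initial_step):
        self.low, self.high = low, high
        self.step = initial_step
        self.value = (low + high) / 2.0  # start at the midpoint of the range
        self.last_direction = 0

    def apply(self, direction):
        """direction: +1 (increment) or -1 (decrement), as chosen by any
        discrete-action RL algorithm (e.g. Q-learning over two actions)."""
        if direction == self.last_direction:
            # Repeated moves in the same direction: grow the step to cover
            # large portions of the action range quickly.
            self.step = min(self.step * 2.0, self.high - self.low)
        else:
            # Direction reversal: shrink the step to refine the action value
            # to arbitrary resolution.
            self.step /= 2.0
        self.last_direction = direction
        # Clamp the modified action to the valid continuous range.
        self.value = max(self.low, min(self.high, self.value + direction * self.step))
        return self.value
```

Under this sketch, the underlying learner never sees the continuous range at all; it only learns over the two-action (or 2-per-dimension) discrete space, which is what makes the method compatible with any discrete-action algorithm.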
Year
2009
DOI
10.1109/ADPRL.2009.4927541
Venue
ADPRL 2009, Nashville, TN
Keywords
continuous systems, discrete systems, iterative methods, learning (artificial intelligence), least squares approximations, stochastic systems, Q-learning, adaptive action modification, bicycle balancing, bicycle riding, coarse discretization, computationally-efficient method, continuous action variables, continuous state spaces, continuous state-action inverted pendulum, continuous-action control policies, data-efficient methods, discrete action space, discrete-action reinforcement learning algorithm, fitted Q-iteration, least-squares policy iteration, stochastic processes
Field
Approximation algorithm, Discretization, Inverted pendulum, Mathematical optimization, Function approximation, Iterative method, Control theory, Q-learning, Stochastic process, Mathematics, Reinforcement learning
DocType
Conference
ISBN
978-1-4244-2761-1
Citations
4
PageRank
0.54
References
11
Authors
2
Name | Order | Citations | PageRank
Jason Pazis | 1 | 104 | 6.97
Michail G. Lagoudakis | 2 | 87 | 7.19