Title
Learning continuous-action control policies
Abstract
Reinforcement learning for control in stochastic processes has received significant attention in recent years. Several data-efficient methods, even for continuous state spaces, have been proposed; however, most of them assume a small, discrete action space. While continuous action spaces are quite common in real-world problems, the most common approach in practice is still coarse discretization of the action space. This paper presents a novel, computationally efficient method, called adaptive action modification, for realizing continuous-action policies using binary decisions that correspond to adaptive increment or decrement changes in the values of the continuous action variables. The proposed approach approximates any continuous action space to arbitrary resolution and can be combined with any discrete-action reinforcement learning algorithm to learn continuous-action policies. Our approach is coupled with three well-known reinforcement learning algorithms (Q-learning, fitted Q-iteration, and least-squares policy iteration), and its use and properties are thoroughly investigated and demonstrated on the continuous state-action inverted pendulum and bicycle balancing and riding domains.
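The core idea described in the abstract can be sketched in a few lines: a discrete-action learner picks only a binary increment/decrement decision per action variable, and an adaptive step size translates those decisions into a continuous action value. The doubling/halving adaptation rule, the class name, and all parameter names below are illustrative assumptions, not the paper's exact scheme.

```python
class AdaptiveActionModifier:
    """Hypothetical sketch: realize a continuous action via binary
    increment/decrement decisions with an adaptive step size.
    The doubling/halving rule is an assumption for illustration."""

    def __init__(self, low, high, initial_step):
        self.low, self.high = low, high
        self.step = initial_step
        self.value = (low + high) / 2.0  # start at the midpoint of the range
        self.last_direction = 0

    def apply(self, direction):
        """direction: +1 (increment) or -1 (decrement), as chosen by any
        discrete-action RL algorithm (e.g. Q-learning over two actions)."""
        if direction == self.last_direction:
            # Repeated moves in the same direction: grow the step to cover
            # large portions of the action range quickly.
            self.step = min(self.step * 2.0, self.high - self.low)
        else:
            # Direction reversal: shrink the step to refine the action value
            # to arbitrary resolution.
            self.step /= 2.0
        self.last_direction = direction
        # Clamp the modified action to the valid continuous range.
        self.value = max(self.low, min(self.high, self.value + direction * self.step))
        return self.value
```

Under this sketch, the underlying learner never sees the continuous range at all; it only learns over the two-action (or 2-per-dimension) discrete space, which is what makes the method compatible with any discrete-action algorithm.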
Year
2009
DOI
10.1109/ADPRL.2009.4927541
Venue
ADPRL 2009, Nashville, TN
Keywords
continuous systems, discrete systems, iterative methods, learning (artificial intelligence), least squares approximations, stochastic systems, Q-learning, adaptive action modification, bicycle balancing, bicycle riding, coarse discretization, computationally-efficient method, continuous action variables, continuous state spaces, continuous state-action inverted pendulum, continuous-action control policies, data-efficient methods, discrete action space, discrete-action reinforcement learning algorithm, fitted Q-iteration, least-squares policy iteration, stochastic processes
Field
Approximation algorithm, Discretization, Inverted pendulum, Mathematical optimization, Function approximation, Iterative method, Control theory, Q-learning, Stochastic process, Mathematics, Reinforcement learning
DocType
Conference
ISBN
978-1-4244-2761-1
Citations
4
PageRank
0.54
References
11
Authors
2
Name | Order | Citations | PageRank
Jason Pazis | 1 | 104 | 6.97
Michail G. Lagoudakis | 2 | 87 | 7.19