Title
Robust exploration in linear quadratic reinforcement learning
Abstract
This paper concerns the problem of learning control policies for an unknown linear dynamical system in order to minimize a quadratic cost function. We present a method, based on convex optimization, that accomplishes this task robustly: i.e., we minimize the worst-case cost, accounting for system uncertainty given the observed data. The method balances exploitation and exploration, exciting the system so as to reduce uncertainty in the model parameters to which the worst-case cost is most sensitive. Numerical simulations and application to a hardware-in-the-loop servo-mechanism demonstrate the approach, with appreciable performance and robustness gains over alternative methods observed in both.
Year: 2019
Venue: Advances in Neural Information Processing Systems 32 (NIPS 2019)
Keywords: worst-case cost
Field: Mathematical optimization, Computer science, Linear quadratic, Reinforcement learning
DocType: Journal
Volume: 32
ISSN: 1049-5258
Citations: 0
PageRank: 0.34
References: 0
Authors: 4
Name                 Order  Citations  PageRank
Jack Umenberger      1      9          4.90
Mina Ferizbegovic    2      0          1.69
Thomas B. Schön      3      744        72.66
Håkan Hjalmarsson    4      0          0.34