Title
Learning From Actor-Critic Algorithm With Application to Asymmetric Tailored Performance Tracking Control of Underactuated Surface Vehicle
Abstract
The most prominent feature of human intelligence is its ability to “learn by doing”. Humans receive and use feedback signals from the environment to adjust their strategy so that they can complete a specific task. In this paper, the problem of tracking control with prescribed performance at both the transient and steady states is formulated for an underactuated surface vehicle (USV). To address this problem, an asymmetric tan-type barrier Lyapunov function is established, and a virtual controller is designed to ensure the tailored position tracking performance of the USV. To “learn” the unknown nonlinearities in the USV's dynamics, an online actor-critic learning scheme consisting of an actor neural network (NN) and a critic NN is put forward. Unlike conventional methods, which are constrained by persistent excitation (PE) conditions, our idea is to divide the USV's navigation process into a training phase and a task execution phase. In the training phase, the actor-critic algorithm, improved by novel tuning laws, learns the unknown dynamics while the USV follows a specified quasi-periodic reference orbit, and the PE condition is relaxed to interval excitation (IE). Then, in the task execution phase, high control accuracy can be maintained with the “learned” optimal NN weights when the USV performs complex missions, even if the reference orbit is no longer periodic or quasi-periodic. Finally, simulation examples are carried out to show the effectiveness of our approach.
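The abstract does not give the exact form of the asymmetric tan-type barrier Lyapunov function. As a minimal sketch only, the snippet below uses a tan-type form that is common in the constrained-control literature, V(z) = (k²/π)·tan(πz²/(2k²)), and switches the bound k according to the sign of the tracking error z. The function name `tan_barrier` and the parameters `k_lower` and `k_upper` are hypothetical and are not taken from the paper.

```python
import math


def tan_barrier(z, k_lower, k_upper):
    """Illustrative asymmetric tan-type barrier value (assumed form, not the paper's).

    z        : tracking error, required to satisfy -k_lower < z < k_upper
    k_lower  : magnitude of the lower error bound (hypothetical parameter)
    k_upper  : magnitude of the upper error bound (hypothetical parameter)
    """
    if not (-k_lower < z < k_upper):
        raise ValueError("tracking error is outside the prescribed bounds")
    # Asymmetry: pick the bound that applies to the sign of the error.
    k = k_upper if z >= 0 else k_lower
    # The value grows without bound as |z| approaches k and
    # tends to z**2 / 2 when k is very large (no effective constraint).
    return (k * k / math.pi) * math.tan(math.pi * z * z / (2.0 * k * k))
```

With `k_lower=1` and `k_upper=2`, an error of -0.5 is penalized more heavily than an error of +0.5 because it sits closer to its bound, which is the qualitative behavior an asymmetric performance envelope is meant to produce.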
Year: 2019
DOI: 10.1109/ICARM.2019.8834334
Venue: 2019 IEEE 4th International Conference on Advanced Robotics and Mechatronics (ICARM)
Keywords: actor-critic algorithm, asymmetric tailored performance tracking control, underactuated surface vehicle, human intelligence, feedback signal, prescribed tracking performance, virtual controller, tailored position tracking performance, actor neural network, critic NN, training phase, task execution phase, quasiperiodic reference orbit, PE condition, learned optimal NN weight, quasiperiodic orbit, tan-type barrier Lyapunov function, excitation conditions, USV navigation process, USV dynamics, interval excitation
Field: Orbit, Control theory, Human intelligence, Computer science, Barrier Lyapunov function, Algorithm, Underactuation, Periodic orbits, Artificial neural network
DocType: Conference
ISBN: 978-1-7281-0065-4
Citations: 0
PageRank: 0.34
References: 9
Authors: 3
Name           Order  Citations  PageRank
Ruiqi Mao      1      0          0.34
Shouxu Zhang   2      0          0.34
Yintao Wang    3      14         1.04