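Reinforcement learning
Trajectory
Computer science
Control theory (sociology)
Tracking (education)
Artificial neural network
Process (computing)
Tracking error
Optimal control
Control system
Function (biology)
Tracking system
Range (aeronautics)
Control engineering
Artificial intelligence
Control (management)
Engineering
Mathematics
Mathematical optimization
Kalman filter
Pedagogy
Astronomy
Aerospace engineering
Physics
Electrical engineering
Operating system
Biology
Evolutionary biology
Psychology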
Identifier
DOI:10.1109/aeeca55500.2022.9919069
Abstract
To achieve highly dynamic and precise tracking of a desired trajectory, an optimal trajectory tracking control strategy with specified performance, based on reinforcement learning, is proposed for the unmanned surface vehicle (USV) trajectory tracking system. For the MIMO discrete-time USV system, a system performance index and a long-term cost function are designed to measure performance, so that the dynamic tracking error is constrained within the specified range and high dynamic performance is ensured during trajectory tracking. On this basis, a control system is constructed on the reinforcement learning actor-critic framework, using two neural networks (NNs). The actor NN generates the optimal control signal. The critic NN evaluates the performance of the USV by approximating the cost function, and in doing so also evaluates the actor NN. The weights of both NNs are adjusted directly during operation of the USV. A rigorous theoretical analysis of the designed control system proves that the closed-loop system is stable and that all closed-loop signals are semi-globally uniformly ultimately bounded. Finally, simulation verifies that the control system achieves trajectory tracking for the USV model.
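
The abstract does not give the form of the performance index or the long-term cost function, so the following Python sketch is only an illustration under stated assumptions, not the paper's formulation. It assumes a binary performance index (0 when every tracking-error component lies inside the bound, 1 otherwise), a hypothetical exponentially shrinking bound, and a discounted sum as the long-term cost, which is the kind of quantity a critic NN would approximate. All function names, parameter values, and error samples are made up for the example.

```python
import math


def performance_bound(k, rho0=2.0, rho_inf=0.1, decay=0.5):
    """Hypothetical error bound at step k: shrinks from rho0 toward rho_inf."""
    return (rho0 - rho_inf) * math.exp(-decay * k) + rho_inf


def performance_index(error, bound):
    """Assumed binary index: 0.0 if all error components are within the bound, else 1.0."""
    return 0.0 if all(abs(e) <= bound for e in error) else 1.0


def long_term_cost(indices, gamma=0.9):
    """Discounted cost J_k = sum_i gamma**i * p_{k+i} for every k, by backward recursion."""
    costs = [0.0] * len(indices)
    running = 0.0
    for k in range(len(indices) - 1, -1, -1):
        running = indices[k] + gamma * running
        costs[k] = running
    return costs


# Made-up two-channel tracking errors (for example, position errors in x and y).
errors = [(3.0, 0.5), (1.0, 0.2), (0.9, 0.1), (0.05, 0.02)]
indices = [performance_index(e, performance_bound(k)) for k, e in enumerate(errors)]
costs = long_term_cost(indices)
```

With these sample values the error leaves the bound at steps 0 and 2, so the index sequence is 1, 0, 1, 0, and the cost at step 0 accumulates both violations with discounting. In an actor-critic scheme of the kind the abstract describes, the critic would estimate this cost online rather than compute it from a recorded sequence.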