Keywords
Reinforcement learning, Hamilton–Jacobi–Bellman equation, Identifier, Artificial neural network, Optimal control, Computer science, Tracking error, Drone, Scheme (mathematics), Control theory (sociology), Artificial intelligence, Tracking (education), Function (biology), Surface (topology), Control (management), Mathematical optimization, Mathematics, Engineering, Mathematical analysis, Biology, Evolutionary biology, Pedagogy, Marine engineering, Programming language, Psychology, Geometry
Authors
Ning Wang,Ying Gao,Hong Zhao,Choon Ki Ahn
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems
[Institute of Electrical and Electronics Engineers]
Date: 2020-08-03
Volume/Issue: 32 (7): 3034-3045
Citations: 167
Identifier
DOI: 10.1109/tnnls.2020.3009214
Abstract
In this article, a novel reinforcement learning-based optimal tracking control (RLOTC) scheme is established for an unmanned surface vehicle (USV) in the presence of complex unknowns, including dead-zone input nonlinearities, system dynamics, and disturbances. To be specific, dead-zone nonlinearities are decoupled to be input-dependent sloped controls and unknown biases that are encapsulated into lumped unknowns within tracking error dynamics. Neural network (NN) approximators are further deployed to adaptively identify complex unknowns and facilitate a Hamilton-Jacobi-Bellman (HJB) equation that formulates optimal tracking. In order to derive a practically optimal solution, an actor-critic reinforcement learning framework is built by employing adaptive NN identifiers to recursively approximate the total optimal policy and cost function. Eventually, theoretical analysis shows that the entire RLOTC scheme can render tracking errors that converge to an arbitrarily small neighborhood of the origin, subject to optimal cost. Simulation results and comprehensive comparisons on a prototype USV demonstrate remarkable effectiveness and superiority.
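For orientation, the HJB formulation referenced in the abstract commonly takes the following generic continuous-time form for optimal tracking. This is a minimal sketch under assumed notation (tracking error e, control u, lumped unknown dynamics f, input gain g, weight matrices Q and R); it does not reproduce the authors' exact error dynamics, dead-zone decomposition, or cost definition.

```latex
% A minimal sketch of a generic optimal-tracking HJB formulation
% (illustrative notation; not the paper's exact equations).
\begin{aligned}
  &\dot{e} = f(e) + g(e)\,u
    && \text{(tracking-error dynamics; $f$ lumps unknowns and disturbances)}\\
  &V^{*}(e) = \min_{u}\int_{t}^{\infty}\!\bigl(e^{\top}Q e + u^{\top}R u\bigr)\,\mathrm{d}\tau
    && \text{(optimal cost function)}\\
  &0 = \min_{u}\Bigl[e^{\top}Q e + u^{\top}R u
       + \nabla V^{*}(e)^{\top}\bigl(f(e) + g(e)\,u\bigr)\Bigr]
    && \text{(HJB equation)}\\
  &u^{*}(e) = -\tfrac{1}{2}\,R^{-1} g(e)^{\top}\nabla V^{*}(e)
    && \text{(optimal policy)}
\end{aligned}
```

In the actor-critic setting described in the abstract, a critic network approximates the cost function V* and an actor network approximates the policy u*, with both sets of weights adapted online so that the HJB residual, and hence the tracking error, remains within a small bound of the origin.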