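Convergence (economics)
Saddle point
Benchmark (surveying)
Control theory (sociology)
Optimal control
Theory (learning stability)
Zero (linguistics)
Differential game
Nonlinear system
Mathematical optimization
Disturbance (geology)
Computer science
Control (management)
Zero-sum game
Bounded function
Mathematics
Saddle
Optimization problem
Nash equilibrium
Artificial intelligence
Philosophy
Geodesy
Geography
Geometry
Economy
Paleontology
Mathematical analysis
Physics
Machine learning
Biology
Quantum mechanics
Economic growth
Linguistics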
Authors
Hamidreza Modares, Frank L. Lewis, Mohammad Bagher Naghibi Sistani
Abstract
In this paper, we present an online learning algorithm that solves the H∞ control problem for continuous-time systems with input constraints. A suitable nonquadratic functional encodes the input constraints into the H∞ control problem, which is then formulated as a two-player zero-sum game with a nonquadratic performance index. A policy iteration algorithm on an actor–critic–disturbance structure is developed to solve the Hamilton–Jacobi–Isaacs (HJI) equation associated with this nonquadratic zero-sum game. That is, three neural network (NN) approximators, namely the actor, the critic, and the disturbance, are tuned online and simultaneously to approximate the HJI solution. The critic NN continuously approximates the value of the current actor and disturbance policies, and on the basis of this value estimate the actor and disturbance NNs are updated in real time to improve their policies. The disturbance NN seeks the worst-case disturbance, whereas the actor NN seeks the best control input. A persistence of excitation condition is shown to guarantee convergence to the optimal saddle point solution. Stability of the closed-loop system is also guaranteed. A simulation on a nonlinear benchmark problem validates the effectiveness of the proposed approach. Copyright © 2012 John Wiley & Sons, Ltd.
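The abstract gives no equations, so the following is only a sketch of how a constrained-input H∞ problem of this kind is commonly written as a zero-sum game. The dynamics f, g, k, the state penalty Q, the weight R (taken diagonal and positive), the input bound λ, the attenuation level γ, and the tanh-based form of U(u) are all assumptions for illustration, not taken from the paper.

```latex
% Assumed input-affine dynamics with control u (|u_i| <= \lambda) and disturbance d
\dot{x} = f(x) + g(x)\,u + k(x)\,d

% Assumed nonquadratic input penalty that encodes the bound \lambda
U(u) = 2 \int_{0}^{u} \bigl( \lambda \tanh^{-1}(v/\lambda) \bigr)^{\top} R \, dv

% Zero-sum game value: u minimizes, d maximizes
V(x_0) = \min_{u} \max_{d} \int_{0}^{\infty}
         \bigl( Q(x) + U(u) - \gamma^{2} \|d\|^{2} \bigr) \, dt

% Stationarity of the Hamiltonian gives the saddle point policies
u^{*} = -\lambda \tanh\!\Bigl( \tfrac{1}{2\lambda} R^{-1} g(x)^{\top} \nabla V^{*} \Bigr),
\qquad
d^{*} = \tfrac{1}{2\gamma^{2}} \, k(x)^{\top} \nabla V^{*}

% HJI equation satisfied by the optimal value V^{*}, with V^{*}(0) = 0
0 = Q(x) + U(u^{*}) - \gamma^{2} \|d^{*}\|^{2}
    + (\nabla V^{*})^{\top} \bigl( f(x) + g(x)\,u^{*} + k(x)\,d^{*} \bigr)
```

Under these assumptions the tanh keeps each control component inside the bound λ by construction. In the structure the abstract describes, the critic NN would approximate V*, while the actor and disturbance NNs would approximate u* and d*.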