Hamilton-Jacobi-Bellman equation
Hamiltonian (control theory)
Dynamic programming
Bellman equation
Mathematics
Reinforcement learning
Applied mathematics
Hamiltonian system
Mathematical optimization
Nonlinear system
Computer science
Mathematical analysis
Physics
Quantum mechanics
Artificial intelligence
Authors
Yongliang Yang,Hamidreza Modares,Kyriakos G. Vamvoudakis,Wei He,Cheng‐Zhong Xu,Donald C. Wunsch
Identifier
DOI:10.1109/tcyb.2021.3108034
Abstract
In this article, we consider an iterative adaptive dynamic programming (ADP) algorithm within the Hamiltonian-driven framework to solve the Hamilton-Jacobi-Bellman (HJB) equation for the infinite-horizon optimal control problem in continuous time for nonlinear systems. First, a novel function, "min-Hamiltonian," is defined to capture the fundamental properties of the classical Hamiltonian. It is shown that both the HJB equation and the policy iteration (PI) algorithm can be formulated in terms of the min-Hamiltonian within the Hamiltonian-driven framework. Moreover, we develop an iterative ADP algorithm that takes into consideration the approximation errors during the policy evaluation step. We then derive a sufficient condition on the iterative value gradient to guarantee closed-loop stability of the equilibrium point as well as convergence to the optimal value. A model-free extension based on an off-policy reinforcement learning (RL) technique is also provided. Finally, numerical results illustrate the efficacy of the proposed framework.
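The policy iteration (PI) scheme the abstract builds on alternates between policy evaluation (solving for the value of the current controller) and policy improvement (choosing the control that minimizes the Hamiltonian). As a hedged illustration only, and not the paper's Hamiltonian-driven ADP method, the sketch below shows PI in the linear-quadratic special case, where the HJB equation reduces to the algebraic Riccati equation and each evaluation step is a Lyapunov equation (Kleinman's algorithm); the double-integrator system and initial gain are assumed examples.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

# System dx/dt = A x + B u with cost  integral of (x'Qx + u'Ru) dt.
# Double integrator chosen purely for illustration.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.eye(1)

K = np.array([[1.0, 1.0]])  # assumed initial stabilizing gain (A - BK is Hurwitz)
for _ in range(10):
    Ac = A - B @ K  # closed-loop dynamics under the current policy u = -Kx
    # Policy evaluation: solve the Lyapunov equation Ac' P + P Ac = -(Q + K'RK),
    # the linear-quadratic counterpart of evaluating the value function.
    P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
    # Policy improvement: the Hamiltonian-minimizing gain K = R^{-1} B' P.
    K = np.linalg.solve(R, B.T @ P)

# Reference: direct solution of the algebraic Riccati equation.
P_opt = solve_continuous_are(A, B, Q, R)
print(np.allclose(P, P_opt))  # iterates converge to the optimal value matrix
```

The iterates P decrease monotonically to the optimal value matrix and each intermediate gain remains stabilizing, which mirrors the closed-loop stability and convergence guarantees the article establishes for its more general nonlinear setting with approximation errors.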