Hamilton–Jacobi–Bellman equation
Optimal control
Linear-quadratic-Gaussian control
Bellman equation
Mathematics
Maximum entropy principle
Dynamic programming
Mathematical optimization
Reinforcement learning
Applied mathematics
Riccati equation
Stochastic control
Control theory
Linear-quadratic regulator
Entropy
Algebraic Riccati equation
Computer science
Differential equations
Mathematical analysis
Control
Artificial intelligence
Statistics
Physics
Quantum mechanics
Authors
Jeongho Kim, Insoon Yang
Source
Journal: IEEE Transactions on Automatic Control
[Institute of Electrical and Electronics Engineers]
Date: 2023-04-01
Volume/Issue: 68 (4): 2018-2033
Citations: 4
Identifier
DOI: 10.1109/tac.2022.3168168
Abstract
Maximum entropy reinforcement learning methods have been successfully applied to a range of challenging sequential decision-making and control tasks. However, most of the existing techniques are designed for discrete-time systems, although there has been growing interest in handling physical processes evolving in continuous time. As a first step toward their extension to continuous-time systems, this article aims to study the theory of maximum entropy optimal control in continuous time. Applying the dynamic programming principle, we derive a novel class of Hamilton–Jacobi–Bellman (HJB) equations and prove that the optimal value function of the maximum entropy control problem corresponds to the unique viscosity solution of the HJB equation. We further show that the optimal control is uniquely characterized as Gaussian in the case of control-affine systems and that, for linear-quadratic problems, the HJB equation is reduced to a Riccati equation, which can be used to obtain an explicit expression of the optimal control. The results of our numerical experiments demonstrate the performance of our maximum entropy method in continuous-time optimal control and reinforcement learning problems.
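The abstract states that for linear-quadratic problems the HJB equation reduces to a Riccati equation with an explicit optimal control, and that the optimal control is Gaussian for control-affine systems. As a rough illustration only (not the paper's exact formulation), the sketch below solves a standard continuous-time algebraic Riccati equation with SciPy for a double-integrator plant and forms a Gaussian policy whose mean is the LQR feedback; the temperature `tau` and the covariance choice `tau * R^{-1}` are illustrative assumptions standing in for the entropy-regularization term derived in the paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double-integrator dynamics x' = A x + B u (illustrative example).
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)          # state cost weight
R = np.array([[1.0]])  # control cost weight
tau = 0.1              # entropy "temperature" (hypothetical parameter)

# Standard CARE: A'P + PA - P B R^{-1} B' P + Q = 0.
P = solve_continuous_are(A, B, Q, R)

# LQR feedback gain: the deterministic optimal control is u = -K x.
K = np.linalg.solve(R, B.T @ P)

# Under entropy regularization the optimal policy is Gaussian; here we
# take mean -K x and covariance proportional to tau * R^{-1}, which is
# an illustrative choice, not the paper's exact expression.
Sigma = tau * np.linalg.inv(R)

x = np.array([1.0, 0.0])   # example state
u_mean = -K @ x
print("gain K:", K, "policy mean at x:", u_mean, "covariance:", Sigma)
```

For this plant the CARE has the closed-form solution P = [[√3, 1], [1, √3]], so the gain is K = [1, √3]; sampling controls from N(-Kx, Sigma) rather than applying -Kx deterministically is what the entropy term encourages.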