Model predictive control
Reinforcement learning
Context (archaeology)
Control theory (sociology)
Computer science
Nonlinear system
Stability (learning theory)
Optimal control
Scheme (mathematics)
Control (management)
Control engineering
Mathematical optimization
Mathematics
Engineering
Artificial intelligence
Machine learning
Biology
Physics
Mathematical analysis
Paleontology
Quantum mechanics
Authors
Sébastien Gros, Mario Zanon
Source
Journal: IEEE Transactions on Automatic Control
[Institute of Electrical and Electronics Engineers]
Date: 2020-02-01
Volume/Issue: 65 (2): 636-648
Citations: 153
Identifiers
DOI: 10.1109/tac.2019.2913768
Abstract
Reinforcement learning (RL) is a powerful tool to perform data-driven optimal control without relying on a model of the system. However, RL struggles to provide hard guarantees on the behavior of the resulting control scheme. In contrast, nonlinear model predictive control (NMPC) and economic NMPC (ENMPC) are standard tools for the closed-loop optimal control of complex systems with constraints and limitations, and benefit from a rich theory to assess their closed-loop behavior. Unfortunately, the performance of (E)NMPC hinges on the quality of the model underlying the control scheme. In this paper, we show that an (E)NMPC scheme can be tuned to deliver the optimal policy of the real system even when using a wrong model. This result also holds for real systems with stochastic dynamics. This entails that ENMPC can be used as a new type of function approximator within RL. Furthermore, we investigate our results in the context of ENMPC and formally connect them to the concept of dissipativity, which is central to ENMPC stability. Finally, we detail how these results can be used to deploy classic RL tools for tuning (E)NMPC schemes. We apply these tools to both a classical linear MPC setting and a standard nonlinear example from the ENMPC literature.
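To illustrate the core idea described in the abstract (a parameterized MPC-style controller used as the function approximator inside an RL loop), the following Python sketch is one deliberately simplified reading, not the paper's (E)NMPC formulation: a one-step, unconstrained quadratic "MPC" with a tunable terminal-cost weight serves as the Q-function, and that weight is adapted by semi-gradient Q-learning while the controller acts on a scalar linear system through a deliberately wrong model. All dynamics, costs, and hyperparameters (a_true, b_true, a_model, b_model, gamma, alpha, etc.) are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (not the paper's exact algorithm): a one-step, unconstrained
# "MPC" with a parameterized terminal cost acts as the Q-function approximator,
# and its parameter theta is adapted by semi-gradient Q-learning.
# All numerical values below are illustrative assumptions.

rng = np.random.default_rng(0)

# Real (unknown) scalar dynamics and the deliberately wrong model used by the controller
a_true, b_true = 0.9, 0.5
a_model, b_model = 0.7, 0.4        # model mismatch, as in the paper's premise

gamma, r_u = 0.95, 0.1             # discount factor and input penalty
alpha = 1e-3                       # learning rate
theta = 1.0                        # tunable terminal-cost weight

def stage_cost(s, u):
    """Quadratic stage cost L(s, u) known to the controller."""
    return s**2 + r_u * u**2

def q_mpc(s, u, th):
    """Q_theta(s, u): stage cost plus discounted parameterized terminal cost
    evaluated at the *model* prediction of the next state."""
    s_pred = a_model * s + b_model * u
    return stage_cost(s, u) + gamma * th * s_pred**2

def greedy_u(s, th):
    """Closed-form minimizer of q_mpc over u (unconstrained quadratic)."""
    return -gamma * th * a_model * b_model * s / (r_u + gamma * th * b_model**2)

s = 1.0
for k in range(20000):
    # exploration noise around the MPC policy
    u = greedy_u(s, theta) + 0.1 * rng.standard_normal()
    cost = stage_cost(s, u)
    s_next = a_true * s + b_true * u + 0.01 * rng.standard_normal()

    # semi-gradient Q-learning update of the MPC parameter theta
    u_next = greedy_u(s_next, theta)
    td_target = cost + gamma * q_mpc(s_next, u_next, theta)
    td_error = td_target - q_mpc(s, u, theta)
    dq_dtheta = gamma * (a_model * s + b_model * u)**2
    theta += alpha * td_error * dq_dtheta
    theta = max(theta, 1e-2)       # keep the terminal cost positive definite

    s = s_next

print("learned terminal-cost weight theta:", theta)
```

Even though the prediction model (a_model, b_model) is wrong, the learned weight theta compensates for the mismatch in closed loop, which is the intuition behind the paper's claim that an (E)NMPC scheme can be tuned by RL to recover the optimal policy of the real system.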