Algebraic Riccati equation
Reinforcement learning
Discrete time and continuous time
Stochastic approximation
Mathematics
Parameterized complexity
Stochastic control
Mathematical optimization
Convergence
Network packet
Bellman equation
Optimal control
Control theory
Markov decision process
Computer science
Markov process
Riccati equation
Control
Algorithm
Differential equation
Mathematical analysis
Statistics
Artificial intelligence
Computer network
Computer security
Key
Economic growth
Economics
Authors
Yi Jiang, Weinan Gao, Ci Chen, Tianyou Chai, Frank L. Lewis
Source
Journal: SIAM Journal on Control and Optimization
[Society for Industrial and Applied Mathematics]
Date: 2023-10-24
Volume/Issue: 61 (5): 3183-3208
Citations: 4
Abstract
This paper investigates the adaptive optimal control problem and proposes fundamentally novel non-model-based approaches for linear discrete-time networked control systems (NCSs) with two-channel stochastic packet dropouts in both the sensor and actuator channels, directly using the data transmitted via communication networks. First, we formulate a modified algebraic Riccati equation parameterized by the system dynamics and the network-induced packet-dropout probabilities, whose solvability is related to a critical arrival probability. To solve this equation, two model-based reinforcement learning algorithms, policy iteration (PI) and value iteration (VI), are designed together with their convergence proofs. To enable application to NCSs with unknown system dynamics, two novel online PI and VI algorithms are designed. These algorithms establish a new theoretical framework for solving the Bellman equation under stochastic dropouts by directly using the data transmitted via the networks. Furthermore, a bilevel learning algorithm is proposed to approximate the critical arrival probability. Last but not least, an extension of the developed online VI algorithm is presented for stochastic systems with both unmeasurable noises and stochastic dropouts.

Keywords: reinforcement learning, adaptive optimal control, modified algebraic Riccati equation, communication networks

MSC codes: 65K05, 65P99, 68W25, 70L99
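To illustrate the kind of iteration the abstract refers to, here is a minimal value-iteration sketch for a modified algebraic Riccati equation (MARE) parameterized by a packet-arrival probability. It assumes the single-channel MARE form P = AᵀPA + Q − γ·AᵀPB(R + BᵀPB)⁻¹BᵀPA familiar from the intermittent-observation literature; the paper's two-channel, data-driven equation differs in detail, and the names `mare_value_iteration`, `gamma` are this sketch's own.

```python
import numpy as np

def mare_value_iteration(A, B, Q, R, gamma, iters=500, tol=1e-10):
    """Value iteration for a modified algebraic Riccati equation (MARE)
    with packet-arrival probability gamma in (0, 1].

    Iterates P <- A'PA + Q - gamma * A'PB (R + B'PB)^{-1} B'PA, the
    standard single-channel MARE form; convergence requires gamma to
    exceed a critical arrival probability that depends on A.
    """
    n = A.shape[0]
    P = np.zeros((n, n))  # monotone iteration started from zero
    for _ in range(iters):
        BtP = B.T @ P
        # gain induced by the current value-function matrix P
        K = np.linalg.solve(R + BtP @ B, BtP @ A)
        P_next = A.T @ P @ A + Q - gamma * A.T @ P @ B @ K
        if np.max(np.abs(P_next - P)) < tol:
            return P_next, K
        P = P_next
    return P, K

# scalar example: unstable A = 1.2, so the critical arrival
# probability is 1 - 1/1.2**2 ~= 0.31; gamma = 0.9 lies above it
A = np.array([[1.2]])
B = np.array([[1.0]])
Q = np.eye(1)
R = np.eye(1)
P, K = mare_value_iteration(A, B, Q, R, gamma=0.9)
```

With gamma below the critical arrival probability the iterates grow without bound instead of settling, which mirrors the solvability threshold the abstract mentions.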