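Reinforcement learning
Computer science
Reinforcement
State (computer science)
Adaptation (eye)
Error-driven learning
Artificial intelligence
Psychology
Neuroscience
Social psychology
Algorithm
Authors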
Yuxiang Zhang,Xiaoling Liang,Dongyu Li,Shuzhi Sam Ge,Bingzhao Gao,Hong Chen,Tong Heng Lee
Source
Journal: IEEE Transactions on Cybernetics
[Institute of Electrical and Electronics Engineers]
Date: 2023-06-26
Volume (issue): 54 (3): 1907-1920
Citations: 10
Identifier
DOI:10.1109/tcyb.2023.3283771
Abstract
High-performance learning-based control of safety-critical autonomous vehicles requires that the full-state variables remain within the safety region even during the learning process. To address this challenging problem, this work proposes an adaptive safe reinforcement learning (RL) algorithm that constrains the full-state variables within the safety region through adaptation. The approach has two notable aspects. First, an optimized backstepping technique and the asymmetric barrier Lyapunov function (BLF) methodology are used to establish a safe learning framework that enforces the full-state constraints. More specifically, each subsystem's control and the partial derivative of its value function are decomposed into asymmetric BLF-related terms and an independent learning part. The independent learning part is then updated through an adaptive learning implementation to solve the Hamilton–Jacobi–Bellman equation and attain the desired control performance. Second, further Lyapunov-based analysis demonstrates that safety is doubly assured by a constrained adaptation algorithm used during optimization, which incorporates the projection operator and can handle the conflict between safety and optimization. The algorithm therefore optimizes system control while ensuring that all state variables stay within the safety region throughout the learning process. Comparison simulations and ablation studies on motion control problems for autonomous vehicles show superior performance, with smaller variance and better convergence under uncertain circumstances, and verify the safety performance of the overall system control under the proposed method.
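The abstract names two ingredients, an asymmetric barrier Lyapunov function and a projection operator, without giving their form. The sketch below illustrates textbook versions of both and is not taken from the paper: it assumes the common log-type asymmetric BLF on an open interval -k_a < z < k_b and a simple norm-ball projection for a parameter update. The names asymmetric_blf, project_update, k_a, k_b, and radius are hypothetical and chosen only for illustration.

```python
import math


def asymmetric_blf(z, k_a, k_b):
    """Log-type asymmetric barrier Lyapunov function (assumed textbook form).

    Defined on the open interval -k_a < z < k_b. It is zero at z = 0 and
    grows without bound as z approaches either limit, so keeping it bounded
    keeps z inside the interval.
    """
    if not (-k_a < z < k_b):
        raise ValueError("z is outside the open safe interval (-k_a, k_b)")
    # Use the upper bound for positive errors and the lower bound otherwise.
    bound = k_b if z > 0 else k_a
    return 0.5 * math.log(bound**2 / (bound**2 - z**2))


def project_update(theta, tau, radius):
    """Norm-ball projection of a parameter update direction (assumed form).

    If theta is on or outside the ball of the given radius and tau points
    outward, the outward (radial) component of tau is removed. Otherwise
    tau is returned unchanged.
    """
    norm_sq = sum(t * t for t in theta)
    outward = sum(t * u for t, u in zip(theta, tau))
    if norm_sq >= radius**2 and outward > 0:
        return [u - (outward / norm_sq) * t for t, u in zip(theta, tau)]
    return list(tau)


if __name__ == "__main__":
    # The barrier value rises sharply near the upper limit k_b = 2.0.
    print(asymmetric_blf(1.0, k_a=1.0, k_b=2.0))
    print(asymmetric_blf(1.99, k_a=1.0, k_b=2.0))
    # On the boundary, the outward part of the update is removed.
    print(project_update([1.0, 0.0], [1.0, 1.0], radius=1.0))
```

In this sketch the barrier handles the state constraint and the projection bounds the learned parameters. How the paper combines the two inside its backstepping design is not specified in the abstract.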