Reinforcement learning
Computer science
Lyapunov function
Stability (learning theory)
Trajectory
Constraint (computer-aided design)
Function (biology)
Motion planning
Lyapunov stability
State (computer science)
Control theory (sociology)
Control (management)
Mathematical optimization
Artificial intelligence
Control engineering
Machine learning
Engineering
Algorithm
Mathematics
Robot
Physics
Biology
Mechanical engineering
Evolutionary biology
Nonlinear system
Quantum mechanics
Astronomy
Authors
Lixian Zhang, Ruixian Zhang, Tong Wu, Rui Weng, Minghao Han, Ye Zhao
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems
[Institute of Electrical and Electronics Engineers]
Date: 2021-07-09
Volume/Issue: 32 (12): 5435-5444
Citations: 77
Identifier
DOI: 10.1109/TNNLS.2021.3084685
Abstract
Reinforcement learning with safety constraints is promising for autonomous vehicles, of which various failures may result in disastrous losses. In general, a safe policy is trained by constrained optimization algorithms, in which the average constraint return as a function of states and actions should be lower than a predefined bound. However, most existing safe learning-based algorithms capture states via multiple high-precision sensors, which complicates the hardware systems and is power-consuming. This article is focused on safe motion planning with the stability guarantee for autonomous vehicles with limited size and power. To this end, the risk-identification method and the Lyapunov function are integrated with the well-known soft actor–critic (SAC) algorithm. By borrowing the concept of Lyapunov functions in the control theory, the learned policy can theoretically guarantee that the state trajectory always stays in a safe area. A novel risk-sensitive learning-based algorithm with the stability guarantee is proposed to train policies for the motion planning of autonomous vehicles. The learned policy is implemented on a differential drive vehicle in a simulation environment. The experimental results show that the proposed algorithm achieves a higher success rate than the SAC.
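The core idea the abstract describes — using a Lyapunov function so that the learned policy keeps the state trajectory inside a safe area — can be illustrated with a minimal action-filter sketch. This is not the paper's algorithm (which integrates the Lyapunov condition into SAC training); it only shows the decrease condition V(s') - V(s) <= -alpha * V(s) being enforced at execution time, on assumed toy 1-D dynamics s' = s + a with the candidate Lyapunov function V(s) = s^2. All names here are illustrative.

```python
ALPHA = 0.1  # required relative decrease of V per step (assumed value)

def dynamics(s, a):
    # Toy 1-D dynamics: next state is current state plus action.
    return s + a

def lyapunov(s):
    # Candidate Lyapunov function; the "safe area" is a sublevel set of V.
    return s ** 2

# Discrete grid of candidate actions in [-1, 1], spacing 0.01.
ACTIONS = [i / 100 - 1.0 for i in range(201)]

def safe_action(s, proposed):
    """Return the candidate action closest to `proposed` that satisfies
    the Lyapunov decrease condition V(s') - V(s) <= -ALPHA * V(s)."""
    feasible = [a for a in ACTIONS
                if lyapunov(dynamics(s, a)) - lyapunov(s) <= -ALPHA * lyapunov(s)]
    if not feasible:
        return None  # no certified-safe action in the candidate set
    return min(feasible, key=lambda a: abs(a - proposed))

# Rolling out with the filter keeps V strictly decreasing, so the state
# never leaves the initial sublevel set {s : V(s) <= V(s0)}.
s = 2.0
values = [lyapunov(s)]
for _ in range(20):
    a = safe_action(s, proposed=0.5)  # the unconstrained policy always asks for 0.5
    if a is None:
        break
    s = dynamics(s, a)
    values.append(lyapunov(s))
```

Because every accepted action forces V to shrink by at least the factor (1 - ALPHA), the trajectory contracts toward the origin even though the proposed action (0.5) would drive the state away; the paper's contribution is to bake this kind of condition into the SAC policy update rather than filtering actions afterward.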