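Reinforcement learning
Acceleration
Missile
Computer science
Line of sight
Proportional navigation
Range (aeronautics)
Thrust
Control theory (sociology)
Missile guidance
Line (geometry)
Terminal guidance
Artificial intelligence
Simulation
Engineering
Aerospace engineering
Mathematics
Control (management)
Physics
Geometry
Classical mechanics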
Authors
Brian Gaudet, Roberto Furfaro, Richard Linares
Identifier
DOI:10.1016/j.ast.2020.105746
Abstract
We present a novel guidance law that uses observations consisting solely of seeker line-of-sight angle measurements and their rate of change. The policy is optimized using reinforcement meta-learning and demonstrated in a simulated terminal phase of a mid-course exo-atmospheric interception. Importantly, the guidance law does not require range estimation, making it particularly suitable for passive seekers. The optimized policy maps stabilized seeker line-of-sight angles and their rate of change directly to commanded thrust for the missile's divert thrusters. Optimization with reinforcement meta-learning allows the optimized policy to adapt to target acceleration, and we demonstrate that the policy performs better than augmented zero-effort miss guidance with perfect target acceleration knowledge. The optimized policy is computationally efficient and requires minimal memory, and should be compatible with today's flight processors.
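To make the observation space and the comparison baseline concrete, here is a minimal Python sketch. It is not the authors' implementation. The function names, the azimuth/elevation angle convention, the Cartesian frame, and the navigation gain of 3 are all assumptions made for illustration. The first function computes the kind of angle-only observation the abstract describes (line-of-sight angles and their rates, with no range output). The second is the textbook form of augmented zero-effort-miss guidance, which, unlike the learned policy, needs relative position, relative velocity, target acceleration, and time-to-go.

```python
import math


def los_observation(r_rel, v_rel):
    """Line-of-sight angles (rad) and their rates (rad/s).

    r_rel, v_rel: target-minus-missile position and velocity as (x, y, z).
    Assumed convention: azimuth in the x-y plane, elevation above it.
    Undefined when the target lies exactly on the z axis (rho == 0).
    """
    x, y, z = r_rel
    vx, vy, vz = v_rel
    rho_sq = x * x + y * y           # squared horizontal range
    rho = math.sqrt(rho_sq)
    azimuth = math.atan2(y, x)
    elevation = math.atan2(z, rho)
    # Time derivatives of atan2(y, x) and atan2(z, rho).
    azimuth_rate = (x * vy - y * vx) / rho_sq
    rho_dot = (x * vx + y * vy) / rho
    elevation_rate = (rho * vz - z * rho_dot) / (rho_sq + z * z)
    # Range is used internally by the simulation but never returned.
    return azimuth, elevation, azimuth_rate, elevation_rate


def augmented_zem_accel(r_rel, v_rel, a_target, t_go, gain=3.0):
    """Textbook augmented ZEM command: gain * ZEM / t_go**2.

    ZEM = r_rel + v_rel * t_go + 0.5 * a_target * t_go**2, i.e. the miss
    that would result if the missile stopped accelerating now.
    gain=3.0 is an assumed, commonly used value.
    """
    zem = tuple(
        r + v * t_go + 0.5 * a * t_go ** 2
        for r, v, a in zip(r_rel, v_rel, a_target)
    )
    return tuple(gain * c / t_go ** 2 for c in zem)


if __name__ == "__main__":
    r = (1000.0, 0.0, 0.0)    # m, hypothetical engagement geometry
    v = (-100.0, 10.0, 0.0)   # m/s
    print(los_observation(r, v))
    print(augmented_zem_accel(r, v, a_target=(0.0, 0.0, 2.0), t_go=10.0))
```

The contrast between the two signatures is the point of the sketch: the baseline consumes quantities that a passive seeker cannot measure directly, while the observation function exposes only angles and angle rates.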