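Pursuit-evasion
Pursuer
Computer science
Reinforcement learning
Geosynchronous orbit
Frame (networking)
Mathematical optimization
Process (computing)
Orbit (dynamics)
Algorithm
Control theory (sociology)
Artificial intelligence
Mathematics
Control (management)
Aerospace engineering
Engineering
Operating system
Telecommunications
Satellite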
Authors
Liran Zhao, Yulin Zhang, Zhaohui Dang
Identifier
DOI:10.1016/j.asr.2023.03.014
Abstract
This paper investigates impulsive orbital pursuit-evasion games (OPEGs) using an artificial intelligence-based approach. First, a mathematical model is constructed for impulsive OPEGs in which the pursuer and the evader both maneuver by applying impulsive velocity increments. Second, the problem is transformed into a bilateral optimization problem with a minimum-maximum performance index on terminal time, subject to multiple constraints such as maneuverability, total fuel consumption, and mission time. To determine the optimal impulsive maneuvers for both sides, a PRD-MADDPG (Predict-Reward-Detect Multi-Agent Deep Deterministic Policy Gradient) algorithm is designed within the framework of multi-agent reinforcement learning. The algorithm uses basic MADDPG for strategy training and learning, and applies the supplemental PRD to predict the change of the game state during the interval between two adjacent impulsive maneuvers and to incorporate this information into training in the form of a predicted reward. Finally, several pursuit-evasion missions near the Geosynchronous Earth Orbit are numerically analyzed to verify the validity and effectiveness of the algorithm. The results show that the PRD-MADDPG algorithm efficiently finds applicable strategies even under rather complex constraints. The learned strategies can also be applied effectively in extended scenarios not seen during training.
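The idea of predicting how the game state evolves between two adjacent impulses and turning that into a reward can be illustrated with a minimal sketch. The abstract does not state the dynamics model or the reward shape, so everything below is an assumption made for illustration: linearized Clohessy-Wiltshire relative motion about a circular GEO reference orbit (x radial, y along-track, z cross-track), a reward equal to the negative minimum pursuer-evader distance over the coast arc, and the hypothetical names cw_stm, apply_impulse, and predicted_reward. It is not the paper's PRD module.

```python
import numpy as np

# Mean motion of a circular GEO reference orbit (one sidereal day), rad/s.
N_GEO = 2.0 * np.pi / 86164.0905


def cw_stm(n, t):
    """Clohessy-Wiltshire state transition matrix for [x, y, z, vx, vy, vz].

    Assumed model: x radial, y along-track, z cross-track, mean motion n.
    """
    s, c = np.sin(n * t), np.cos(n * t)
    return np.array([
        [4 - 3 * c,         0, 0,      s / n,            2 * (1 - c) / n,         0],
        [6 * (s - n * t),   1, 0,      -2 * (1 - c) / n, (4 * s - 3 * n * t) / n, 0],
        [0,                 0, c,      0,                0,                       s / n],
        [3 * n * s,         0, 0,      c,                2 * s,                   0],
        [-6 * n * (1 - c),  0, 0,      -2 * s,           4 * c - 3,               0],
        [0,                 0, -n * s, 0,                0,                       c],
    ])


def apply_impulse(state, dv):
    """Return a copy of the state with an instantaneous velocity increment dv."""
    out = np.array(state, dtype=float)
    out[3:] += np.asarray(dv, dtype=float)
    return out


def predicted_reward(rel_state, coast_time, n=N_GEO, samples=50):
    """Hypothetical predicted reward for the pursuer over one coast arc.

    rel_state is the pursuer-minus-evader relative state just after both
    impulses. Because the assumed dynamics are linear, the relative state
    propagates with the same transition matrix. The reward is the negative
    of the closest sampled approach; in a zero-sum setting the evader
    would receive the opposite sign.
    """
    times = np.linspace(0.0, coast_time, samples)
    dists = [np.linalg.norm((cw_stm(n, t) @ rel_state)[:3]) for t in times]
    return -min(dists)
```

As a usage example under the same assumptions, a learner could call apply_impulse on each agent's state with its chosen action, form the relative state, and add predicted_reward(rel_state, coast_time) to the instantaneous reward, so that the outcome of the unpowered arc is credited to the impulse that caused it.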