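Reinforcement learning
Markov decision process
Computer science
Jamming
Artificial intelligence
Radar
Partially observable Markov decision process
Electronic warfare
Machine learning
Process (computing)
Decision support system
Operations research
Markov process
Markov chain
Markov model
Engineering
Telecommunications
Physics
Thermodynamics
Statistics
Mathematics
Operating system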
Authors
Wen Jiang,Yihui Ren,Yanping Wang
Identifier
DOI:10.1016/j.dsp.2023.103952
Abstract
Most existing anti-jamming decision-making methods rely heavily on the subjective experience of radar operators. However, with the rapid development of cognitive radar and modern electronic warfare, conventional anti-jamming decision-making methods can no longer adapt to the complex and changing electromagnetic environment. The advent of deep reinforcement learning (DRL) provides an attractive new solution to this issue. In this paper, an adversarial anti-jamming decision-making network for cognitive radar based on multi-agent deep reinforcement learning (MDRL) is proposed, which has good self-learning ability and can meet the requirements of modern electronic warfare for intelligent, dynamic, and real-time decision-making. Since competitive decision-makers are considered and the two confrontational sides cannot obtain completely accurate information about each other, the environment is modeled as a partially observable Markov decision process (POMDP). Then, a decision-making network is designed based on the deep deterministic policy gradient (DDPG) algorithm to explore the competition between the cognitive radar and a smart jammer. To overcome the non-stationarity of the environment, the decision-making network is trained and tested in a special MDRL framework. The experimental results demonstrate that the proposed method is effective for the anti-jamming decision-making system of cognitive radar. Furthermore, compared with other training policies, the two confrontational sides show high decision-making ability and perform well in the adversarial scenario, which demonstrates that confrontational training against powerful opponents can improve the intelligence level of all agents.
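
The partial observability described in the abstract can be illustrated with a toy example. The sketch below is not the environment, reward, or network from the paper; it is a minimal, hypothetical two-agent zero-sum game in which each side only receives a noisy estimate of the opponent's last action. The class name `ToyRadarJammerEnv`, the normalized frequency actions, and the parameters `jam_bandwidth` and `obs_noise` are all assumptions made for illustration. Continuous actions are used because DDPG-style policies output continuous values.

```python
import random


class ToyRadarJammerEnv:
    """Hypothetical zero-sum radar-vs-jammer game (illustration only)."""

    def __init__(self, jam_bandwidth=0.1, obs_noise=0.05, seed=0):
        # Assumed: jamming succeeds if the two frequencies are this close.
        self.jam_bandwidth = jam_bandwidth
        # Assumed: standard deviation of each side's sensing error.
        self.obs_noise = obs_noise
        self.rng = random.Random(seed)

    def _observe(self, true_value):
        # Partial observability: an agent sees only a noisy, clipped
        # estimate of the opponent's action, never the true value.
        noisy = true_value + self.rng.gauss(0.0, self.obs_noise)
        return min(1.0, max(0.0, noisy))

    def step(self, radar_freq, jammer_freq):
        """Both actions are normalized carrier frequencies in [0, 1]."""
        jammed = abs(radar_freq - jammer_freq) < self.jam_bandwidth
        radar_reward = -1.0 if jammed else 1.0
        # Zero-sum: the jammer's reward is the negative of the radar's.
        rewards = {"radar": radar_reward, "jammer": -radar_reward}
        observations = {
            "radar": self._observe(jammer_freq),   # radar senses the jammer
            "jammer": self._observe(radar_freq),   # jammer senses the radar
        }
        return observations, rewards


env = ToyRadarJammerEnv()
obs, rew = env.step(radar_freq=0.2, jammer_freq=0.8)
print(obs, rew)
```

In a multi-agent training loop, each agent's policy would map its own observation to its next action. Because the opponent's policy also changes during training, the environment appears non-stationary from either agent's point of view, which is the issue the abstract says the MDRL framework is meant to address.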