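Frequency-hopping spread spectrum
Jamming
Computer science
Markov decision process
Rate of convergence
Communications system
Markov process
Algorithm
Mathematical optimization
Telecommunications
Mathematics
Statistics
Thermodynamics
Physics
Channel (broadcasting)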
Authors
Y. Zhang,Zhijin Zhao,Shilian Zheng,Fangfang Qiang
Source
Journal: IEEE Transactions on Cognitive Communications and Networking
[Institute of Electrical and Electronics Engineers]
Date: 2023-12-01
Volume (Issue): 9 (6): 1579-1595
Identifier
DOI:10.1109/tccn.2023.3306363
Abstract
Conventional frequency hopping (FH) systems are susceptible to malicious jamming because their hopping frequency table is prearranged. In this paper, we develop a bivariate frequency agility (BFA) communication system that improves anti-jamming capability by making time-varying the communication parameters, such as the frequency interval and hopping rate, that are fixed in conventional FH. Our goal is to find the frequency interval and hopping rate strategy that maximizes the signal-to-interference-plus-noise ratio (SINR) in a jamming environment. We formulate the parameter decision problem as a Markov decision process (MDP) and propose a deep deterministic policy gradient (DDPG) based algorithm for frequency interval selection and hopping rate setting. In addition, because DDPG is prone to falling into local optima and to unstable convergence, we propose an improved DDPG algorithm (IDDPG) with weighted dual-prioritized experience replay and a periodically updated learning rate. In IDDPG, the model is trained by replaying more often the experiences with high immediate reward and large temporal difference (TD) error, which makes it more accurate. The learning rate is also decayed periodically, so that the update rate of the network model varies periodically, resulting in richer and more diverse exploration. Simulation results under different electromagnetic jamming environments indicate that the two proposed algorithms outperform the PPER-DQN and RFH algorithms in anti-jamming performance.
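The abstract names two ingredients of IDDPG, a replay priority that combines immediate reward with TD error and a learning rate that decays periodically, but gives no formulas for either. The following is a minimal sketch of one way these ideas could look, not the authors' implementation. The blend factor `weight`, the min-max normalization, the exponential decay-and-restart schedule, and every function and parameter name are assumptions made for illustration.

```python
import random


def dual_priorities(rewards, td_errors, weight=0.5, eps=1e-6):
    """Blend normalized immediate reward and |TD error| into one priority.

    `weight` (hypothetical) is the share given to the reward term;
    the remainder goes to the TD-error term. `eps` keeps every
    transition sampleable.
    """
    abs_td = [abs(e) for e in td_errors]
    r_min, r_max = min(rewards), max(rewards)
    r_span = (r_max - r_min) or 1.0   # avoid division by zero
    td_max = max(abs_td) or 1.0       # avoid division by zero
    return [
        weight * (r - r_min) / r_span + (1.0 - weight) * d / td_max + eps
        for r, d in zip(rewards, abs_td)
    ]


def sample_indices(priorities, batch_size, rng=random):
    """Draw transition indices with probability proportional to priority."""
    return rng.choices(range(len(priorities)), weights=priorities, k=batch_size)


def periodic_lr(step, lr_max=1e-3, lr_min=1e-5, period=1000):
    """Decay exponentially from lr_max toward lr_min, restarting every `period` steps.

    This schedule is an assumed stand-in for the paper's
    "periodically updated learning rate".
    """
    phase = (step % period) / period
    return lr_max * (lr_min / lr_max) ** phase


if __name__ == "__main__":
    # Toy replay buffer statistics: three stored transitions.
    prios = dual_priorities(rewards=[0.0, 1.0, 0.5], td_errors=[0.1, -2.0, 1.0])
    batch = sample_indices(prios, batch_size=8, rng=random.Random(0))
    print("priorities:", prios)
    print("sampled indices:", batch)
    print("lr at steps 0, 500, 1000:", [periodic_lr(s) for s in (0, 500, 1000)])
```

In a DDPG training loop, `sample_indices` would pick the minibatch from the replay buffer and `periodic_lr` would set the optimizer step size for the actor and critic at each update. Restarting the learning rate at the top of each period is what lets the size of the network updates rise again after it has decayed.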