Deep Q-Network Based Target Search of UAV under Partially Observable Conditions
Computer Science
Artificial Intelligence
Physics
Quantum Mechanics
Authors
Sheng Nan Jiang, Sheng Wang, Subing Huang
Identifiers
DOI:10.1109/icsece58870.2023.10263279
Abstract
Autonomous target search for unmanned aerial vehicles (UAVs) has broad application scenarios in both military and civilian domains. In this paper, a deep reinforcement learning-based autonomous target search method for the UAV is proposed, which incorporates a recurrent neural network to achieve accurate decision-making under partially observable conditions. Firstly, we model the UAV's target search task as a partially observable Markov decision process (POMDP) and use a Deep Q-Network (DQN) to evaluate the current state and generate actions. Secondly, a gated recurrent unit (GRU) is introduced to mitigate the decision bias that arises under partially observable conditions and to map the UAV's incomplete observations at the current moment to a more complete representation of its perceptual state. Finally, to avoid the slow algorithm convergence caused by sparse rewards, we combine prior knowledge to design a reward function suited to the UAV target search task, which accelerates convergence. To verify the effectiveness of the algorithm, we designed an experimental environment for UAV target search based on a 3D simulation platform. The experimental results show that our method enables the UAV to successfully search for targets in less time for the same number of training episodes.
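The abstract's core idea — feeding partial observations through a GRU so the Q-network acts on an accumulated hidden state, plus a shaped reward that pays out for closing the distance to the target rather than only on discovery — can be illustrated with a minimal sketch. The paper does not publish its architecture or reward coefficients, so the dimensions, gate equations (standard GRU), and the `shaped_reward` coefficients below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GRUQNet:
    """Tiny recurrent Q-network: one GRU cell followed by a linear Q-head.

    Illustrative only; weight shapes and initialization are assumptions."""

    def __init__(self, obs_dim, hid_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1  # weight scale (arbitrary)
        self.Wz, self.Uz = rng.normal(0, s, (hid_dim, obs_dim)), rng.normal(0, s, (hid_dim, hid_dim))
        self.Wr, self.Ur = rng.normal(0, s, (hid_dim, obs_dim)), rng.normal(0, s, (hid_dim, hid_dim))
        self.Wh, self.Uh = rng.normal(0, s, (hid_dim, obs_dim)), rng.normal(0, s, (hid_dim, hid_dim))
        self.Wq = rng.normal(0, s, (n_actions, hid_dim))  # Q-head
        self.hid_dim = hid_dim

    def step(self, obs, h):
        """One decision step: update hidden state from a partial observation,
        then score each discrete action with Q-values."""
        z = sigmoid(self.Wz @ obs + self.Uz @ h)          # update gate
        r = sigmoid(self.Wr @ obs + self.Ur @ h)          # reset gate
        h_tilde = np.tanh(self.Wh @ obs + self.Uh @ (r * h))
        h_new = (1.0 - z) * h + z * h_tilde               # blended hidden state
        q = self.Wq @ h_new                               # Q-value per action
        return q, h_new

def shaped_reward(prev_dist, dist, found, step_cost=0.01, k=1.0, bonus=10.0):
    """Dense reward in the spirit of the paper's prior-knowledge shaping:
    reward progress toward the target, penalize each step, pay a terminal
    bonus on discovery. All coefficients are assumed, not from the paper."""
    if found:
        return bonus
    return k * (prev_dist - dist) - step_cost

# Greedy action selection over a short observation sequence
net = GRUQNet(obs_dim=4, hid_dim=8, n_actions=5)
h = np.zeros(8)
for obs in [np.ones(4), np.zeros(4)]:
    q, h = net.step(obs, h)
    action = int(np.argmax(q))
```

In a full DRQN-style agent the hidden state would be carried across the episode and training would sample whole observation sequences from the replay buffer; this sketch only shows the forward decision path and the shaping signal.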