Flocking (texture)
Reinforcement learning
Reinforcement
Computer science
Artificial intelligence
Psychology
Social psychology
Art
Visual arts
Authors
Yunxiao Guo,Dan Xu,Chang Wang,L Tan,Shufeng Shi,Wanchao Zhang,Xiaohui Sun,Han Long
Source
Journal: Lecture Notes in Electrical Engineering
Date: 2024-01-01
Volume/issue/pages: 1-14
Identifier
DOI:10.1007/978-981-97-1087-4_1
Abstract
Deep reinforcement learning has been applied to the control of flocking tasks for fixed-wing Unmanned Aerial Vehicles (UAVs) with successful results. However, previous research has given little attention to the design of flocking rewards, and the underlying mechanism of these rewards remains unclear. In this paper, we analyze the underlying mechanism of the flocking reward and propose the leader-guided C-S reward to guide the fixed-wing UAV flock in a leader-follower structure. We prove that this reward is bounded when time is limited, which can avoid the exploding gradient problem. Additionally, we propose a collision-free fixed-wing UAV flocking system that uses multi-agent deep deterministic policy gradient to alleviate the non-stationarity of the environment. The proposed system is simulated in 3- and 6-follower scenarios, and the results validate that it effectively controls UAV flocking.