A Lightweight Reinforcement-Learning-Based Real-Time Path-Planning Method for Unmanned Aerial Vehicles

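Reinforcement learning; Computer science; Robustness (evolution); Motion planning; Artificial intelligence; Adaptation (eye); Real-time computing; Distributed computing; Machine learning; Robot; Biochemistry; Optics; Chemistry; Physics; Gene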

Authors

Meng Xi, Huiao Dai, Jingyi He, Wenjie Li, Jiabao Wen, Shuai Xiao, Jiachen Yang

Source

Journal: IEEE Internet of Things Journal [Institute of Electrical and Electronics Engineers]
Date: 2024-01-05 Volume (Issue): 11 (12): 21061-21071 Citations: 24

Identifier

DOI: 10.1109/jiot.2024.3350525

Abstract

Unmanned aerial vehicles (UAVs) can perform a wide variety of tasks and hold great potential. Deep neural network (DNN) technology has enabled the UAV-assisted paradigm, accelerated the construction of smart cities, and propelled the development of the Internet of Things (IoT). UAVs play an increasingly important role in applications such as surveillance, environmental monitoring, emergency rescue, and supplies delivery, for which a robust path-planning technique is a prerequisite. However, existing methods do not fully account for the complicated urban environment and do not provide an overall assessment of robustness and generalization. Meanwhile, because of the resource constraints and hardware limitations of UAVs, the complexity of the deployed network needs to be reduced. This paper proposes a lightweight, reinforcement-learning-based real-time path-planning method for UAVs, the Adaptive Soft Actor-Critic (ASAC) algorithm, which optimizes the training process, network architecture, and algorithmic model. First, we establish a framework of global training and local adaptation, in which a structured environment model is constructed for interaction and local, dynamically varying information helps improve generalization. Second, ASAC introduces a cross-layer connection that passes the original state information into the higher layers to avoid feature loss and improve learning efficiency. Finally, we propose an adaptive temperature coefficient, which flexibly adjusts the exploration probability of UAVs according to the training phase and the accumulation of experience data. In addition, a series of comparison experiments designed around practical application requirements has been conducted, and the results demonstrate the superiority of ASAC.
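
The cross-layer connection mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the actual ASAC architecture, so everything below is an assumption made for illustration: the function names `init_params` and `forward`, the layer sizes, the ReLU and tanh activations, and the choice of concatenation as the way the raw state is passed to higher layers.

```python
import numpy as np

def init_params(state_dim, hidden_dim, action_dim, n_hidden=2, seed=0):
    """Create weights for a small policy network (hypothetical sizes)."""
    rng = np.random.default_rng(seed)
    params = []
    in_dim = state_dim
    for _ in range(n_hidden):
        W = rng.normal(0.0, 0.1, size=(in_dim, hidden_dim))
        params.append((W, np.zeros(hidden_dim)))
        # Assumption: every later layer receives the previous hidden
        # features concatenated with the original state.
        in_dim = hidden_dim + state_dim
    W_out = rng.normal(0.0, 0.1, size=(in_dim, action_dim))
    params.append((W_out, np.zeros(action_dim)))
    return params

def forward(params, state):
    """Forward pass that re-injects the raw state after each hidden layer."""
    x = state
    for W, b in params[:-1]:
        h = np.maximum(0.0, x @ W + b)            # ReLU hidden features
        x = np.concatenate([h, state], axis=-1)   # cross-layer connection
    W_out, b_out = params[-1]
    return np.tanh(x @ W_out + b_out)             # actions bounded in [-1, 1]

# Usage: a batch of 4 states with 8 features each, 2 action dimensions.
params = init_params(state_dim=8, hidden_dim=16, action_dim=2)
actions = forward(params, np.ones((4, 8)))
```

In this sketch the higher layers always see the unprocessed state alongside the learned features, which is one plausible reading of "passes the original state information into the higher layers"; the paper itself may use a different mechanism.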
