Computer science
Randomness
Reinforcement learning
Artificial intelligence
Obstacle avoidance
Generalization
Real-time computing
Obstacle
Robot
Path (computing)
Mobile robot
Simulation
Motion planning
Statistics
Mathematical analysis
Mathematics
Programming language
Law
Political science
Authors
Sitong Zhang, Yibing Li, Qianhui Dong
Identifier
DOI:10.1016/j.asoc.2021.108194
Abstract
Path planning is one of the most essential parts of autonomous navigation. Most existing works assume that the environment is static and fixed. However, path planning is widely needed in random and dynamic environments (such as search and rescue, surveillance, and other scenarios). In this paper, we propose a Deep Reinforcement Learning (DRL)-based method that enables unmanned aerial vehicles (UAVs) to execute navigation tasks in multi-obstacle environments with randomness and dynamics. The method is based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. In order to predict the impact of the environment on the UAV, the change of environment observations is added to the Actor–Critic network input, and a two-stream Actor–Critic network structure is proposed to extract features of environment observations. Simulations are carried out to evaluate the performance of the algorithm, and experimental results show that our method enables the UAV to complete autonomous navigation tasks safely in multi-obstacle environments, which reflects the efficiency of our method. Moreover, compared to DDPG and the conventional TD3, our method has better generalization ability.
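The abstract's key architectural idea is feeding both the current environment observation and its change between steps through separate feature streams before fusing them into the policy output. The following is a minimal numpy sketch of that two-stream actor idea; all layer sizes, weight names, and the single-layer structure are illustrative assumptions, not the paper's actual network.

```python
import numpy as np

def relu(x):
    # Standard rectified-linear activation
    return np.maximum(x, 0.0)

def two_stream_actor(obs, obs_change, params):
    """Sketch of a two-stream actor: one stream encodes the current
    environment observation, the other encodes its change (delta)
    between consecutive time steps, so the policy can anticipate how
    moving obstacles affect the UAV. Shapes and weights are hypothetical."""
    h_obs = relu(obs @ params["W_obs"])            # stream 1: current observation
    h_delta = relu(obs_change @ params["W_delta"])  # stream 2: observation change
    fused = np.concatenate([h_obs, h_delta])        # fuse both feature streams
    # tanh bounds the action, e.g. a normalized UAV velocity command
    return np.tanh(fused @ params["W_out"])

# Hypothetical dimensions: 8-D observation, 16 features per stream, 3-D action
rng = np.random.default_rng(0)
params = {
    "W_obs": rng.standard_normal((8, 16)) * 0.1,
    "W_delta": rng.standard_normal((8, 16)) * 0.1,
    "W_out": rng.standard_normal((32, 3)) * 0.1,
}
obs_t = rng.standard_normal(8)
obs_change_t = rng.standard_normal(8)   # e.g. obs_t - obs_{t-1}
action = two_stream_actor(obs_t, obs_change_t, params)
```

In a TD3 training loop, this actor would be paired with twin critics and delayed policy updates; the sketch only illustrates how the two input streams could be fused.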