Keywords
Reinforcement learning, Computer science, Motion planning, End-to-end principle, Set (abstract data type), Artificial intelligence, Obstacle avoidance, Process (computing), Theory (learning stability), Hindsight, Task (project management), Action (physics), Motion (physics), Robot, Machine learning, Mobile robot, Engineering, Psychology, Physics, Systems engineering, Quantum mechanics, Programming language, Cognitive psychology, Operating system
Authors
Xi Lyu, Yushan Sun, Lifeng Wang, Jiehui Tan, Liwen Zhang
Abstract
This study aims to solve the problems of sparse rewards, a single policy, and poor environmental adaptability in the local motion planning task of autonomous underwater vehicles (AUVs). We propose an end-to-end perception–planning–execution method based on a two-layer deep deterministic policy gradient (DDPG) algorithm to overcome the training and learning challenges of end-to-end approaches that directly output control forces. In this approach, the state set is established from the environment information, the action set from the motion characteristics of the AUV, and the control execution force set from the control constraints. The mappings between these sets are trained with deep reinforcement learning, enabling the AUV to perform the appropriate action in the current state and thereby accomplish tasks in an end-to-end manner. Furthermore, we introduce hindsight experience replay (HER) in the perception–planning mapping to improve stability and sample efficiency during training. Finally, we conduct simulation experiments covering planning, execution, and end-to-end performance. The results demonstrate that the proposed method improves decision-making capability and real-time obstacle avoidance during planning. Compared with global planning, the end-to-end algorithm comprehensively accounts for the constraints of the AUV planning process, yielding more realistic, gentler, and more stable AUV actions and keeping tracking errors under control.
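The HER technique named in the abstract addresses sparse rewards by relabeling stored transitions with goals the agent actually achieved, so that failed episodes still produce positive learning signal. Below is a minimal, generic sketch of HER's "future" relabeling strategy for a goal-conditioned replay buffer; the transition tuple, `reward_fn`, and all names are illustrative assumptions, not the paper's implementation:

```python
import random
from collections import namedtuple

# Transition in a goal-conditioned MDP (illustrative layout, not the paper's).
Transition = namedtuple("Transition", "state action reward next_state goal")

def her_relabel(episode, reward_fn, k=4):
    """HER 'future' strategy: for each transition, sample up to k achieved
    states from the rest of the episode and store copies of the transition
    with those states substituted as goals, recomputing the reward."""
    relabeled = []
    for t, tr in enumerate(episode):
        relabeled.append(tr)  # keep the original transition
        future = episode[t:]  # achieved states from this step onward
        for _ in range(min(k, len(future))):
            new_goal = random.choice(future).next_state
            r = reward_fn(tr.next_state, new_goal)
            relabeled.append(Transition(tr.state, tr.action, r,
                                        tr.next_state, new_goal))
    return relabeled

# Sparse reward: 0 when the achieved state matches the goal, -1 otherwise.
def sparse_reward(achieved, goal):
    return 0.0 if achieved == goal else -1.0
```

Because the substitute goals were actually reached, some relabeled transitions carry the success reward even when the original goal was never achieved, which is what makes training stable and sample-efficient under sparse rewards.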