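Reinforcement learning
Adversarial system
Computer science
Task (project management)
Convergence (economics)
Artificial intelligence
Premise
Control (management)
Machine learning
Engineering
Linguistics
Economic growth
Philosophy
Economy
Systems engineering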
Authors
Tianze Zhang,Xuhong Miao,Yibin Li,Lei Jia,Yinghao Zhuang
Identifier
DOI:10.1109/tc.2021.3072072
Abstract
In this study, we consider the surfacing control problem for an autonomous underwater vehicle (AUV) in three-dimensional space in emergencies. Most model-based controllers cannot solve this problem effectively because the model of the task environment is unknown. Moreover, existing deep reinforcement learning (DRL) methods have limitations such as slow convergence and overestimation bias. To address these issues, we propose a model-free DRL algorithm based on the Deep Deterministic Policy Gradient within the paradigm of deep learning as a service (DLaaS). The algorithm combines existing expert episodes, used as demonstrations, with transitions from the interaction between the agent and the task environment, which potentially reduces the risk from adversarial attacks. We also propose a variable delay learning mechanism to improve the performance of the algorithm. Simulation results show that, while still solving the AUV surfacing control tasks safely, our method converges faster and is more robust under adversarial attack than the existing DRL method.
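The idea of combining expert demonstrations with the agent's own transitions can be illustrated with a replay buffer that draws each minibatch from both sources. The following is only a minimal sketch, not the authors' implementation: the abstract does not say how the two sources are mixed, so the fixed `demo_fraction`, the class name `MixedReplayBuffer`, and the transition layout are all assumptions made for illustration.

```python
import random
from collections import deque
from typing import Deque, List, Sequence, Tuple

# Hypothetical transition layout: (state, action, reward, next_state, done)
Transition = Tuple[list, list, float, list, bool]


class MixedReplayBuffer:
    """Replay buffer mixing fixed expert demonstrations with agent transitions.

    Assumption: a constant share of every minibatch comes from the
    demonstrations; the paper may use a different mixing rule.
    """

    def __init__(
        self,
        demo_transitions: Sequence[Transition],
        capacity: int = 100_000,
        demo_fraction: float = 0.25,
        seed: int = 0,
    ) -> None:
        if not 0.0 <= demo_fraction <= 1.0:
            raise ValueError("demo_fraction must lie in [0, 1]")
        self.demo: List[Transition] = list(demo_transitions)  # never evicted
        self.agent: Deque[Transition] = deque(maxlen=capacity)  # FIFO eviction
        self.demo_fraction = demo_fraction
        self.rng = random.Random(seed)

    def add(self, transition: Transition) -> None:
        """Store one transition collected by the agent."""
        self.agent.append(transition)

    def sample(self, batch_size: int) -> List[Transition]:
        """Draw a shuffled minibatch from both sources without replacement."""
        n_demo = min(len(self.demo), round(batch_size * self.demo_fraction))
        n_agent = min(len(self.agent), batch_size - n_demo)
        batch = self.rng.sample(self.demo, n_demo)
        batch += self.rng.sample(list(self.agent), n_agent)
        self.rng.shuffle(batch)
        return batch
```

In a Deep Deterministic Policy Gradient training loop, the critic and actor updates would consume `sample(batch_size)` in place of a plain replay buffer; early in training, when few agent transitions exist, the returned batch is simply smaller.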