Reinforcement learning
Computer science
Benchmark (surveying)
Source code
Computation
Mathematical optimization
Population
Entropy (arrow of time)
Algorithm
Artificial intelligence
Mathematics
Physics
Operating system
Geography
Demography
Sociology
Quantum mechanics
Geodesy
Authors
Hieu Trung Nguyen, Khang Tran, Ngoc Hoang Luong
Identifier
DOI:10.1109/nics54270.2021.9701549
Abstract
Hybridizations of Deep Reinforcement Learning (DRL) and Evolutionary Computation (EC) methods have recently shown considerable success in a variety of high-dimensional physical control tasks. These hybrid frameworks offer more robust exploration and exploitation in the policy network parameter search space by stabilizing the gradient-based updates of DRL algorithms with population-based operations adopted from EC methods. In this paper, we propose a novel hybrid framework that effectively combines the efficiency of DRL updates and the stability of EC populations. We experiment with integrating the Twin Delayed Deep Deterministic Policy Gradient (TD3) and the Cross-Entropy Method (CEM). The resulting EC-enhanced TD3 algorithm (eTD3) is compared with the baseline algorithm TD3 and a state-of-the-art evolutionary reinforcement learning (ERL) method, CEM-TD3. Experimental results on five MuJoCo continuous control benchmark environments confirm the efficacy of our approach. The source code of the paper is available at https://github.com/ELO-Lab/eTD3.
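The abstract describes coupling TD3's gradient-based updates with CEM's population-based search over policy network parameters. Below is a minimal sketch, not the authors' released eTD3 code, of the CEM outer loop that such hybrids build on: a Gaussian search distribution over a parameter vector is repeatedly refit to the elite fraction of sampled candidates. The fitness function and dimensionality here are hypothetical stand-ins for evaluating a policy's episodic return in an environment.

import numpy as np

def fitness(theta):
    # Hypothetical stand-in for a policy's estimated episodic return.
    return -np.sum((theta - 3.0) ** 2)

def cem(dim=5, pop_size=20, elite_frac=0.25, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.full(dim, 2.0)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iters):
        # Sample a population of candidate parameter vectors.
        pop = mean + std * rng.standard_normal((pop_size, dim))
        scores = np.array([fitness(p) for p in pop])
        # Refit the Gaussian search distribution to the elite samples.
        elites = pop[np.argsort(scores)[-n_elite:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean

print(cem())  # converges toward the optimum at 3.0 in each dimension

In CEM-TD3-style hybrids, the fitness evaluation rolls out each sampled policy in the environment, and part of the population is additionally improved with TD3 gradient steps before being scored, which is the combination of DRL efficiency and EC stability the abstract refers to.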