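Reinforcement learning
Computer science
Task (project management)
Generalization
Artificial intelligence
Convergence (economics)
Function (biology)
Representation (politics)
Bellman equation
Machine learning
Mathematical optimization
Politics
Mathematical analysis
Biology
Economics
Evolutionary biology
Management
Law
Economic growth
Mathematics
Political science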
Authors
Xiaolu Hou,Zhenyang Guo,Xuan Wang,Tao Qian,Jiajia Zhang,Shuhan Qi,Jing Xiao
Identifier
DOI:10.1016/j.knosys.2021.107753
Abstract
Traditional reinforcement learning methods are applicable only to single-scenario tasks, and agents trained on a single scenario fail to perform well in multi-scenario settings. In other words, traditional reinforcement learning methods generalize poorly when facing different tasks simultaneously. In this work, we propose a practical deep reinforcement learning framework that can perform on multiple 3D scenarios concurrently. We adopt the Actor–Learner framework to parallelize multiple scenarios and resolve the policy lag problem by generalizing Retrace(λ) to a new value function. We prove its convergence theoretically. In addition, inspired by the hard shared representation in multi-task learning, we design an auxiliary recognition task and an auxiliary control task to improve the performance of our multi-scenario agent. Experimental results show that our method outperforms state-of-the-art algorithms on DMLab-30, with larger advantages on multi-scenario games. We verify the effectiveness of each part of our framework through ablation experiments. We also find that our parallel learner is transferable by testing it on untrained scenarios.
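For readers unfamiliar with the off-policy correction the abstract builds on, the following is a minimal sketch of the standard Retrace(λ) target for one trajectory. It is not the generalized value function proposed in the paper, which the abstract does not specify. The function name `retrace_targets` and all argument names are hypothetical, chosen for illustration; the inputs are assumed to be per-step quantities already computed by the learner.

```python
def retrace_targets(rewards, q_taken, v_next, pi_probs, mu_probs,
                    gamma=0.99, lam=1.0):
    """Standard Retrace(lambda) targets for one trajectory (illustrative sketch).

    rewards[t]  : reward r_t
    q_taken[t]  : current estimate Q(x_t, a_t)
    v_next[t]   : E_{a ~ pi} Q(x_{t+1}, a), taken as 0 if x_{t+1} is terminal
    pi_probs[t] : target-policy probability pi(a_t | x_t)
    mu_probs[t] : behaviour-policy probability mu(a_t | x_t)
    """
    n = len(rewards)
    # Truncated importance weights: c_t = lam * min(1, pi / mu).
    c = [lam * min(1.0, p / m) for p, m in zip(pi_probs, mu_probs)]
    # One-step TD errors under the target policy.
    deltas = [rewards[t] + gamma * v_next[t] - q_taken[t] for t in range(n)]

    targets = [0.0] * n
    acc = 0.0  # running correction G_t = delta_t + gamma * c_{t+1} * G_{t+1}
    for t in reversed(range(n)):
        next_c = c[t + 1] if t + 1 < n else 0.0
        acc = deltas[t] + gamma * next_c * acc
        targets[t] = q_taken[t] + acc
    return targets
```

The truncation `min(1, pi / mu)` is what limits the variance of the correction when the actors' behaviour policy lags behind the learner's policy. When the two policies agree and `lam=1.0`, every weight equals 1 and the target reduces to an uncorrected multi-step return.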