Reinforcement learning
Computer science
Artificial intelligence
Machine learning
Convolutional neural network
Feature learning
Multi-task learning
Deep learning
Authors
Dayang Liang, Qihang Chen, Yunlong Liu
Identifier
DOI:10.1016/j.knosys.2021.107535
Abstract
Deep reinforcement learning (DRL) has achieved great success in recent years by combining the feature extraction power of deep learning with the decision-making power of reinforcement learning. In the literature, Convolutional Neural Networks (CNNs) are usually used for feature extraction, and recent studies have shown that the performance of DRL algorithms can be greatly improved by incorporating an attention mechanism, where the raw attentions are used directly for decision-making. However, reinforcement learning is a trial-and-error process, and it is almost impossible to learn an optimal policy at the beginning of learning, especially in environments with sparse rewards. As a result, raw attention-based models can only remember and utilize attention information indiscriminately across different areas and may focus on task-irrelevant regions, and such focusing is usually unhelpful and ineffective for finding the optimal policy. To address this issue, we propose a gated multi-attention mechanism, which is combined with the Deep Q-learning network to form GMAQN. The gated multi-attention representation (GMA) module in GMAQN can effectively eliminate task-irrelevant attention information in the early phase of the trial-and-error process and improve the stability of the model. The proposed method has been evaluated on the challenging domain of classic Atari 2600 games, and experimental results show that, compared with the baselines, it achieves better performance in terms of both scores and the effect of focusing on key regions.
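The abstract does not spell out the internals of the GMA module, so the following is only a minimal PyTorch sketch of one way a gated multi-head spatial attention block could sit between a CNN encoder and a Q-value head. The class names, the head-level sigmoid gating scheme, and all layer sizes are assumptions made for illustration, not the authors' actual GMAQN architecture.

```python
import torch
import torch.nn as nn

class GatedMultiAttention(nn.Module):
    """Illustrative gated multi-head spatial attention over CNN features.

    A learned sigmoid gate rescales each attention head, so heads that
    attend to task-irrelevant regions can be suppressed (hypothetical
    stand-in for the paper's GMA module).
    """
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.num_heads = num_heads
        # One 1x1 conv per head produces a spatial attention logit map.
        self.attn_logits = nn.Conv2d(channels, num_heads, kernel_size=1)
        # One scalar gate per head, computed from a global feature summary.
        self.gate = nn.Sequential(nn.Linear(channels, num_heads), nn.Sigmoid())

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, channels, height, width) from the CNN encoder
        b, c, h, w = features.shape
        # Per-head attention weights, normalized over spatial locations.
        logits = self.attn_logits(features).view(b, self.num_heads, h * w)
        attn = torch.softmax(logits, dim=-1).view(b, self.num_heads, h, w)
        # Gate each head; a gate near 0 silences an irrelevant head.
        gates = self.gate(features.mean(dim=(2, 3)))       # (b, heads)
        attn = attn * gates.view(b, self.num_heads, 1, 1)
        # Attention-weighted feature summary per head, concatenated.
        weighted = features.unsqueeze(1) * attn.unsqueeze(2)  # (b, heads, c, h, w)
        return weighted.sum(dim=(3, 4)).flatten(1)            # (b, heads * c)

class GMAQNSketch(nn.Module):
    """Toy Q-network: CNN encoder -> gated multi-attention -> Q-values."""
    def __init__(self, num_actions: int, channels: int = 64, num_heads: int = 4):
        super().__init__()
        # Standard Atari-style encoder over 4 stacked grayscale frames.
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, channels, kernel_size=4, stride=2), nn.ReLU(),
        )
        self.attention = GatedMultiAttention(channels, num_heads)
        self.q_head = nn.Linear(channels * num_heads, num_actions)

    def forward(self, stacked_frames: torch.Tensor) -> torch.Tensor:
        return self.q_head(self.attention(self.encoder(stacked_frames)))
```

For example, `GMAQNSketch(num_actions=6)(torch.zeros(1, 4, 84, 84))` returns a (1, 6) tensor of Q-values; in a full DQN pipeline this network would replace the usual fully connected head, with the gates trained end to end from the temporal-difference loss.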