Reinforcement learning
Computer science
Artificial intelligence
Invariant (physics)
Adversarial system
Representation (politics)
Domain (mathematical analysis)
Generalization
Feature learning
Machine learning
Authors
Dongfen Li, Lichao Meng, Jingjing Li, Ke Lu, Yang Yang
Identifier
DOI:10.1016/j.ins.2022.07.156
Abstract
In recent years, deep reinforcement learning (RL) has shown excellent performance in robot control, video games, and multi-agent systems. However, most existing RL models do not generalize: even a small visual change can greatly deteriorate the performance of RL agents, which limits the generalization and flexibility of RL in real-world applications. To address this problem, we propose a two-stage model in which reinforcement learning agents learn to adapt to changes in the visual environment before learning optimal behavioral policies. In the first stage, we employ domain adaptation to align the distributions of domain-invariant state representations from different domains in the latent feature space. Specifically, we introduce feature-level and pixel-level multi-granularity adversarial losses to constrain the learning of domain-invariant state representations. In the second stage, the RL agent is trained on the learned domain-invariant state representations. Since the adjusted observation is domain-invariant, the learned policy has strong cross-domain generalization performance. We name the proposed method Adversarial-based Domain Invariant State Representation (Ad-DISR). Finally, we evaluate Ad-DISR on various variants of Car-Racing games and on CARLA, an autonomous driving simulator. The results show that our method achieves better performance on both reward scores and living time in both the source and target domains.
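The feature-level adversarial alignment described for the first stage can be illustrated with a deliberately tiny sketch. Everything here is an illustrative assumption, not the paper's architecture: a 1-D affine "encoder", a logistic domain "discriminator", and hand-written gradients. The discriminator is trained to tell source from target features, while the encoder is updated in the opposite direction to fool it, shrinking the gap between the two encoded distributions.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def mean(xs):
    return sum(xs) / len(xs)

# Deterministic toy "observations": source centered at +1, target at -1.
source = [0.8, 1.0, 1.2]
target = [-0.8, -1.0, -1.2]

# Encoder z = w*x + b; discriminator p(domain = source | z) = sigmoid(a*z + c).
w, b = 1.0, 0.0
a, c = 1.0, 0.0
lr = 0.1
labeled = [(x, 1.0) for x in source] + [(x, 0.0) for x in target]
n = len(labeled)

def domain_gap():
    # Distance between the mean encoded source and target features.
    zs = [w * x + b for x in source]
    zt = [w * x + b for x in target]
    return abs(mean(zs) - mean(zt))

gap_before = domain_gap()

for step in range(20):
    # Discriminator step: descend binary cross-entropy on domain labels.
    ga = gc = 0.0
    for x, y in labeled:
        p = sigmoid(a * (w * x + b) + c)
        ga += (p - y) * (w * x + b)
        gc += (p - y)
    a -= lr * ga / n
    c -= lr * gc / n

    # Encoder step: ascend the same loss, i.e. confuse the discriminator
    # (equivalent in spirit to a gradient-reversal layer).
    gw = gb = 0.0
    for x, y in labeled:
        p = sigmoid(a * (w * x + b) + c)
        gw += (p - y) * a * x
        gb += (p - y) * a
    w += lr * gw / n
    b += lr * gb / n

gap_after = domain_gap()
```

After a few alternating updates the encoder collapses the domain-discriminative direction, so `gap_after` is well below `gap_before`; in stage two an RL policy would then be trained on these aligned features. In the actual method the encoder is a convolutional network and the losses operate at both feature and pixel granularity, which this scalar sketch only caricatures.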