Domain adaptive state representation alignment for reinforcement learning

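Reinforcement learning; Computer science; Artificial intelligence; Invariant (physics); Adversarial system; Representation (politics); Domain (mathematical analysis); Generalization; Feature learning; Machine learning
Authors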
Dongfen Li,Lichao Meng,Jingjing Li,Ke Lu,Yang Yang
Source
Journal: Information Sciences [Elsevier BV]
Volume: 609, pages 1353-1368
Identifier
DOI:10.1016/j.ins.2022.07.156
Abstract

In recent years, deep reinforcement learning (RL) has shown excellent performance in robot control, video games, and multi-agent systems. However, most existing RL models generalize poorly. Even a small visual change can severely degrade the performance of RL agents, which limits the generalization and flexibility of RL in real-world applications. To address this problem, we propose a two-stage model in which RL agents learn to adapt to changes in the visual environment before learning optimal behavioral policies. In the first stage, we employ domain adaptation to align the distributions of state representations from different domains in the latent feature space. Specifically, we introduce multi-granularity adversarial losses at the feature level and the pixel level to constrain the learning of domain-invariant state representations. In the second stage, the RL agent is trained on the learned domain-invariant state representations. Since the adjusted observation is domain-invariant, the learned policy has strong cross-domain generalization performance. We name the proposed method Adversarial-based Domain Invariant State Representation (Ad-DISR). Finally, we evaluate Ad-DISR on several variants of the Car-Racing game and on CARLA, an autonomous driving simulator. The results show that our method achieves better performance in terms of both reward score and living time in both the source and target domains.
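
The first-stage objective can be illustrated with a minimal sketch. The abstract does not give the exact form of the losses, so the code below assumes a standard GAN-style binary cross-entropy in which a discriminator labels source samples 1 and target samples 0, and the encoder is rewarded when target samples are classified as source. The function names, the weights `w_feat` and `w_pix`, and the flattening of per-pixel discriminator outputs into a plain list are all hypothetical choices for illustration, not the authors' implementation.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for a single predicted probability."""
    eps = 1e-12  # clamp to avoid log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def discriminator_loss(src_probs, tgt_probs):
    """Assumed convention: the discriminator should output 1 for source, 0 for target."""
    losses = [bce(p, 1.0) for p in src_probs] + [bce(p, 0.0) for p in tgt_probs]
    return sum(losses) / len(losses)


def encoder_adversarial_loss(tgt_probs):
    """The encoder is trained so that target samples are scored as source (label 1)."""
    return sum(bce(p, 1.0) for p in tgt_probs) / len(tgt_probs)


def multi_granularity_loss(feat_tgt_probs, pix_tgt_probs, w_feat=1.0, w_pix=1.0):
    """Weighted sum of a feature-level and a pixel-level adversarial term.

    feat_tgt_probs: discriminator outputs on latent features of target states.
    pix_tgt_probs:  discriminator outputs on (flattened) pixel-level predictions.
    The weights are hypothetical hyperparameters.
    """
    return (w_feat * encoder_adversarial_loss(feat_tgt_probs)
            + w_pix * encoder_adversarial_loss(pix_tgt_probs))


if __name__ == "__main__":
    # Toy discriminator outputs; in practice these come from trained networks.
    feature_level = [0.4, 0.6]
    pixel_level = [0.3, 0.5, 0.7]
    print(multi_granularity_loss(feature_level, pixel_level))
```

Under these assumptions, the combined loss falls as both discriminators become less able to tell target observations from source ones, which is the sense in which the learned representation is domain-invariant. In the second stage, the encoder would be kept fixed and the policy trained on its outputs.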
