Deep Reinforcement Learning in Nonstationary Environments With Unknown Change Points

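Reinforcement learning, Computer science, Robustness (evolution), Robot, Artificial intelligence, Drone, Theory (learning stability), Process (computing), Change detection, Machine learning, Biochemistry, Chemistry, Genetics, Biology, Gene, Operating system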
Authors
Z. Q. Liu, Jie Lü, Junyu Xuan, Guangquan Zhang
Source
Journal: IEEE Transactions on Cybernetics [Institute of Electrical and Electronics Engineers]
Volume (issue): 54 (9): 5191-5204 Cited by: 3
Identifier
DOI:10.1109/tcyb.2024.3356981
Abstract

Deep reinforcement learning (DRL) is a powerful tool for learning from interactions within a stationary environment where state transition and reward distributions remain constant throughout the process. Addressing the practical but challenging nonstationary environments with time-varying state transition or reward function changes during the interactions, ingenious solutions are essential for the stability and robustness of DRL agents. A key assumption to cope with nonstationary environments is that the change points between the previous and the new environments are known beforehand. Unfortunately, this assumption is impractical in many cases, such as outdoor robots and online recommendations. To address this problem, this article presents a robust DRL algorithm for nonstationary environments with unknown change points. The algorithm actively detects change points by monitoring the joint distribution of states and actions. A detection boosted, gradient-constrained optimization method then adapts the training of the current policy with the supporting knowledge of formerly well-trained policies. The previous policies and experience help the current policy adapt rapidly to environmental changes. Experiments show that the proposed method accumulates the highest reward among several alternatives and is the fastest to adapt to new environments. This work has compelling potential for increasing the environmental suitability of intelligent agents, such as drones, autonomous vehicles, and underwater robots.
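
The abstract says that change points are detected by monitoring the joint distribution of states and actions, but it does not specify the detector. The sketch below is one plausible illustration, not the authors' method: it treats each (state, action) pair as a vector and runs a permutation test on the squared maximum mean discrepancy (MMD) between a reference window and a recent window. The function names, the RBF bandwidth, the window contents, and the significance level are all assumptions made for this example.

```python
import math
import random


def rbf_kernel_matrix(points, bandwidth):
    """Gram matrix of the RBF kernel over a list of equal-length vectors."""
    n = len(points)
    gram = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            gram[i][j] = gram[j][i] = math.exp(-sq / (2.0 * bandwidth ** 2))
    return gram


def mmd2(gram, idx_x, idx_y):
    """Biased squared MMD between two index sets of a pooled Gram matrix."""
    def mean_block(rows, cols):
        total = sum(gram[i][j] for i in rows for j in cols)
        return total / (len(rows) * len(cols))
    return (mean_block(idx_x, idx_x) + mean_block(idx_y, idx_y)
            - 2.0 * mean_block(idx_x, idx_y))


def detect_change(reference, recent, bandwidth=1.0, n_perm=100,
                  alpha=0.05, seed=0):
    """Return True if `recent` (state, action) vectors appear to come from
    a different joint distribution than `reference` (hypothetical helper)."""
    pooled = list(reference) + list(recent)
    gram = rbf_kernel_matrix(pooled, bandwidth)
    n = len(reference)
    indices = list(range(len(pooled)))
    observed = mmd2(gram, indices[:n], indices[n:])
    rng = random.Random(seed)
    exceed = 0
    for _ in range(n_perm):
        # Shuffle window membership to simulate the "no change" hypothesis.
        rng.shuffle(indices)
        if mmd2(gram, indices[:n], indices[n:]) >= observed:
            exceed += 1
    p_value = (1 + exceed) / (1 + n_perm)
    return p_value < alpha
```

In use, an agent would keep a window of (state, action) vectors collected under the current environment and call `detect_change(reference_window, latest_window)` at intervals; a `True` result would then trigger whatever adaptation step is in place. How the paper couples detection to its gradient-constrained optimization is not described in the abstract and is not modelled here.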
