Attention Enhanced Reinforcement Learning for Multi agent Cooperation

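Reinforcement learning; Computer science; Robustness (evolution); Distributed computing; Graph; Convergence (economics); Operator (biology); Network topology; Artificial intelligence; Theoretical computer science; Computer network; Gene; Transcription factor; Repressor; Economics; Chemistry; Biochemistry; Economic growth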
Authors
Zhiqiang Pu, Huimu Wang, Zhen Liu, Jianqiang Yi, Shiguang Wu
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems [Institute of Electrical and Electronics Engineers]
Volume (issue): 34 (11): 8235-8249 Citations: 24
Identifier
DOI:10.1109/tnnls.2022.3146858
Abstract

In this article, a novel method, called attention enhanced reinforcement learning (AERL), is proposed to address issues including complex interaction, limited communication range, and time-varying communication topology in multi-agent cooperation. AERL includes a communication enhanced network (CEN), a graph spatiotemporal long short-term memory network (GST-LSTM), and parameter-sharing multi-pseudo critic proximal policy optimization (PS-MPC-PPO). Specifically, CEN, based on a graph attention mechanism, is designed to enlarge the agents' communication range and to deal with complex interaction among the agents. GST-LSTM, which replaces the standard fully connected (FC) operator in LSTM with a graph attention operator, is designed to capture the temporal dependence while maintaining the spatial structure learned by CEN. PS-MPC-PPO extends proximal policy optimization (PPO) to multi-agent systems, using parameter sharing to scale training to environments with a large number of agents; it is designed with multi-pseudo critics to mitigate the bias problem in training and accelerate convergence. Simulation results for three groups of representative scenarios, including formation control, group containment, and predator-prey games, demonstrate the effectiveness and robustness of AERL.
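
The abstract does not give the internals of CEN, so the following is only a minimal sketch of the general idea it names: attention restricted to agents within communication range, applied more than once so that information can travel beyond a single hop. It assumes scaled dot-product attention over raw features with no learned projections, which is a simplification chosen for illustration and not necessarily the attention form used in AERL. The names `communication_mask`, `masked_attention`, and `comm_range` are hypothetical.

```python
import math


def communication_mask(positions, comm_range):
    """mask[i][j] is True when agent j is within comm_range of agent i (self included)."""
    return [[math.dist(p, q) <= comm_range for q in positions] for p in positions]


def masked_attention(features, mask):
    """One round of masked scaled dot-product attention over in-range agents.

    Assumption: queries, keys, and values are the raw feature vectors; a
    trained layer would apply learned projections first.
    """
    n, dim = len(features), len(features[0])
    scale = math.sqrt(dim)
    weights = []
    for i, query in enumerate(features):
        # Score only the agents that agent i can actually hear.
        scores = {
            j: sum(a * b for a, b in zip(query, key)) / scale
            for j, key in enumerate(features)
            if mask[i][j]
        }
        top = max(scores.values())  # subtract the max for numerical stability
        exps = {j: math.exp(s - top) for j, s in scores.items()}
        total = sum(exps.values())
        # Out-of-range agents get exactly zero attention weight.
        weights.append([exps.get(j, 0.0) / total for j in range(n)])
    mixed = [
        [sum(w * f[d] for w, f in zip(row, features)) for d in range(dim)]
        for row in weights
    ]
    return mixed, weights


# Three agents on a line; agents 0 and 2 are out of each other's range.
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask = communication_mask(positions, comm_range=1.5)

hop1, w1 = masked_attention(features, mask)
hop2, w2 = masked_attention(hop1, mask)

# Composite influence of agent j's original features on agent i after two rounds.
influence = [
    [sum(w2[i][k] * w1[k][j] for k in range(3)) for j in range(3)]
    for i in range(3)
]
print(w1[0][2], influence[0][2])
```

In this toy setup, agent 0 gives agent 2 zero weight after one round because agent 2 is out of range, but after a second round agent 2's features reach agent 0 through agent 1. This is one plausible reading of how stacked graph attention can enlarge the effective communication range.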
