Reinforcement learning
Computer science
Robustness (evolution)
Distributed computing
Graph
Convergence (economics)
Operator (biology)
Network topology
Artificial intelligence
Theoretical computer science
Computer networks
Gene
Transcription factor
Repressor
Economy
Chemistry
Biochemistry
Economic growth
Authors
Zhiqiang Pu,Huimu Wang,Zhen Liu,Jianqiang Yi,Shiguang Wu
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems
[Institute of Electrical and Electronics Engineers]
Date: 2022-01-01
Pages: 1-15
Cited by: 4
Identifier
DOI: 10.1109/TNNLS.2022.3146858
Abstract
In this article, a novel method called attention-enhanced reinforcement learning (AERL) is proposed to address the issues of complex interaction, limited communication range, and time-varying communication topology in multi-agent cooperation. AERL comprises a communication-enhanced network (CEN), a graph spatiotemporal long short-term memory network (GST-LSTM), and parameter-sharing multi-pseudo-critic proximal policy optimization (PS-MPC-PPO). Specifically, CEN, based on a graph attention mechanism, is designed to enlarge the agents' communication range and to handle the complex interactions among agents. GST-LSTM, which replaces the standard fully connected (FC) operator in LSTM with a graph attention operator, captures temporal dependence while preserving the spatial structure learned by CEN. PS-MPC-PPO, which extends proximal policy optimization (PPO) to multi-agent systems with parameter sharing so that training scales to environments with many agents, uses multiple pseudo-critics to mitigate the bias problem in training and to accelerate convergence. Simulation results for three groups of representative scenarios (formation control, group containment, and predator-prey games) demonstrate the effectiveness and robustness of AERL.
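The abstract's central mechanism, attention-weighted communication over a restricted graph, can be illustrated with a minimal sketch. This is not the authors' CEN implementation; the function name `graph_attention_comm`, the projection matrices `Wq`/`Wk`/`Wv`, and the binary adjacency mask are illustrative assumptions showing how a graph attention operator aggregates messages only from agents within communication range:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def graph_attention_comm(h, adj, Wq, Wk, Wv):
    """One round of attention-weighted message passing (illustrative).

    h:   (N, d) agent embeddings
    adj: (N, N) binary communication mask (1 = within range)
    Wq, Wk, Wv: (d, d) hypothetical learned projections
    """
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    scores = q @ k.T / np.sqrt(h.shape[1])    # pairwise attention logits
    scores = np.where(adj > 0, scores, -1e9)  # mask agents out of range
    alpha = softmax(scores, axis=-1)          # per-agent attention weights
    return alpha @ v                          # aggregated messages, (N, d)

rng = np.random.default_rng(0)
N, d = 4, 8
h = rng.normal(size=(N, d))
adj = np.ones((N, N))  # fully connected for this demo
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
out = graph_attention_comm(h, adj, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

In the paper's GST-LSTM, an operator of this kind replaces the fully connected transforms inside the LSTM gates, so the recurrence tracks temporal dependence while each update still respects the communication graph.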