Keywords
Reinforcement learning
Intersection (road traffic)
Computer science
Traffic signal
Multi-agent system
Mathematical optimization
Artificial intelligence
Mathematics
Engineering
Authors
Junxiu Liu, Sheng Qin, Min Su, Yuling Luo, Yanhu Wang, Su Yang
Identifier
DOI:10.1016/j.ins.2023.119484
Abstract
In multi-agent traffic signal control, the signal at each intersection is controlled by an independent agent. Because each agent's control policy is dynamic, at large traffic scales a policy adjustment by one agent has non-stationary effects on the surrounding intersections, destabilizing the overall system. It is therefore necessary to eliminate this non-stationarity to stabilize the multi-agent system. This work proposes a collaborative multi-agent reinforcement learning method that overcomes the instability problem through a collaborative mechanism. Decentralized learning with limited communication is used to reduce the communication latency between agents. A Shapley value reward function comprehensively calculates each agent's contribution, avoiding the influence of reward-coefficient variation and thereby reducing sources of instability. The Kullback-Leibler divergence between the current and historical policies is then used to optimize the loss function and eliminate the environmental non-stationarity. Experimental results demonstrate that the Shapley value reward function and the optimized loss function reduce the average travel time and its standard deviation, respectively, and this work provides an alternative for traffic signal control over multiple intersections.
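The abstract credits a Shapley value reward function with fairly attributing the overall outcome to each agent. As an illustration of that idea (not the paper's actual implementation), the sketch below computes exact Shapley values by averaging each agent's marginal contribution over all join orders; the coalition value function and its numbers are hypothetical, standing in for a measured quantity such as total queue reduction when a set of intersections coordinates.

```python
from itertools import permutations

def shapley_values(agents, value_fn):
    """Exact Shapley values: average each agent's marginal contribution
    over every ordering of the agents (feasible only for small n)."""
    n = len(agents)
    shapley = {a: 0.0 for a in agents}
    orders = list(permutations(agents))
    for order in orders:
        coalition = []
        for a in order:
            before = value_fn(frozenset(coalition))
            coalition.append(a)
            after = value_fn(frozenset(coalition))
            shapley[a] += after - before  # marginal contribution of a
    return {a: v / len(orders) for a, v in shapley.items()}

# Hypothetical coalition value (illustrative numbers): per-intersection
# gains plus a synergy bonus when adjacent intersections A and B coordinate.
def coalition_value(coalition):
    base = {"A": 4.0, "B": 2.0, "C": 1.0}
    v = sum(base[a] for a in coalition)
    if {"A", "B"} <= coalition:
        v += 3.0  # coordination synergy, split fairly by the Shapley value
    return v

print(shapley_values(["A", "B", "C"], coalition_value))
# A and B share the 3.0 synergy equally: {'A': 5.5, 'B': 3.5, 'C': 1.0}
```

By the efficiency property, the shares always sum to the grand-coalition value, so each agent's reward reflects its contribution without hand-tuned reward coefficients, which is the stabilizing effect the abstract attributes to this reward design.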