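Reinforcement learning
Computer science
Signal (programming language)
Artificial intelligence
Traffic signal
Control (management)
Reinforcement (steel)
Convergence (economics)
Adaptive control
Machine learning
Function (biology)
Real-time computing
Engineering
Structural engineering
Economics
Programming language
Economic growth
Evolutionary biology
Biology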
Authors
Yanjiao Xu,Yuxin Wang,Chanjuan Liu
Identifier
DOI:10.1109/fcsit57414.2022.00022
Abstract
Traffic signal control systems affect the efficiency of transportation. Adaptive traffic signal control has attracted attention because it automatically adjusts the signal phase according to traffic conditions. Researchers have been applying deep reinforcement learning to the design of adaptive traffic signal control. Meta-parameters such as $\gamma$, the discount rate of future rewards, are crucial in reinforcement learning. Researchers, especially those working across disciplines, need to conduct extensive experiments to find appropriate values for these meta-parameters. Automated Reinforcement Learning (AutoRL) automates the choice of meta-parameters, replacing manual tuning. Gradient-based meta-learning is one AutoRL approach. To save time spent exploring meta-parameters, we integrate a gradient-based meta-learning algorithm into DQN (GBML-DQN). We conduct experiments on the traffic simulator SUMO. Our results show that GBML-DQN promotes convergence of the Q-value function and avoids overestimation to some extent, especially under inappropriate training settings in which DQN fails to train.
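To make the role of $\gamma$ concrete, the following is a minimal sketch of the standard one-step DQN target, $r + \gamma \max_a Q(s', a)$, together with its derivative with respect to $\gamma$. It is not the authors' GBML-DQN implementation: the function names `td_target` and `td_target_grad_gamma` and the sample numbers are hypothetical, chosen only for illustration. The derivative is included because a gradient-based method needs the target to be differentiable in the meta-parameter it adapts.

```python
def td_target(reward, next_q_values, gamma, done=False):
    """One-step DQN-style target: r + gamma * max_a Q(s', a).

    On a terminal transition there is no future value, so the target is r.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def td_target_grad_gamma(next_q_values, done=False):
    """Derivative of the target with respect to gamma.

    Treating Q(s', .) as fixed, d/d(gamma) [r + gamma * max Q] = max Q.
    """
    return 0.0 if done else max(next_q_values)


if __name__ == "__main__":
    # Hypothetical transition: a small waiting-time penalty and three
    # candidate signal phases with made-up Q-value estimates.
    reward = -1.0
    next_q = [2.0, 5.0, 3.0]

    for gamma in (0.5, 0.9, 0.99):
        target = td_target(reward, next_q, gamma)
        print(f"gamma={gamma}: target={target:.3f}")

    print("d target / d gamma =", td_target_grad_gamma(next_q))
```

A larger $\gamma$ weights the estimated future value more heavily in the target, which is why a poorly chosen value can slow or destabilize training.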