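Reinforcement learning
Computer science
Perspective (graphical)
Function (biology)
Control (management)
Artificial intelligence
Machine learning
Evolutionary biology
Biology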
Authors
Yixu He,Yang Liu,Lan Yang,Xiaobo Qu
Identifier
DOI:10.1080/19427867.2024.2305018
Abstract
The application of deep reinforcement learning (DRL) techniques in intelligent transportation systems has garnered significant attention. In this field, reward function design is a crucial factor in DRL performance. Current research predominantly relies on trial and error to design reward functions, which lacks mathematical support and requires extensive empirical experimentation. Our research uses vehicle velocity control as a case study, builds training and test sets, and develops a DRL framework for speed control. This framework examines both single-objective and multi-objective optimization in reward function design. In single-objective optimization, we introduce "expected optimal velocity" as the optimization objective and analyze how different reward functions affect performance, providing a mathematical perspective on optimizing reward functions. In multi-objective optimization, we propose a reward function design paradigm and validate its effectiveness. Our findings offer a versatile framework and theoretical guidance for developing and optimizing reward functions in DRL, particularly for intelligent transportation systems.
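The abstract does not give the reward functions themselves, so the following Python sketch only illustrates the kind of comparison it describes. It assumes the "expected optimal velocity" is available as a plain target value, and every name in it (`velocity_tracking_reward`, `multi_objective_reward`, the three shapes, the weights) is hypothetical. The weighted sum is one common way to combine objectives and is not claimed to be the paradigm proposed in the paper.

```python
import math


def velocity_tracking_reward(v, v_target, shape="quadratic", scale=1.0):
    """Single-objective reward that is maximal when v equals v_target.

    v_target stands in for the "expected optimal velocity" (assumed given).
    The three shapes are illustrative choices, not the paper's functions.
    """
    err = v - v_target
    if shape == "absolute":
        return -scale * abs(err)            # linear penalty, constant slope
    if shape == "quadratic":
        return -scale * err ** 2            # penalizes large errors more strongly
    if shape == "gaussian":
        return math.exp(-scale * err ** 2)  # bounded in (0, 1], flat far from target
    raise ValueError(f"unknown reward shape: {shape}")


def multi_objective_reward(terms, weights):
    """Weighted sum of per-objective reward terms (an assumed combination rule)."""
    return sum(weights[name] * value for name, value in terms.items())


if __name__ == "__main__":
    # Compare how each shape scores the same 2 m/s velocity error.
    for shape in ("absolute", "quadratic", "gaussian"):
        print(shape, velocity_tracking_reward(18.0, 20.0, shape))

    # Combine a speed term with a hypothetical comfort term.
    terms = {"speed": velocity_tracking_reward(18.0, 20.0), "comfort": -1.0}
    print(multi_objective_reward(terms, {"speed": 0.5, "comfort": 2.0}))
```

Different shapes give the agent different gradients near and far from the target, which is the sort of effect a mathematical analysis of reward functions would need to account for.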