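Reinforcement learning
Computer science
Leverage (statistics)
Robustness (evolution)
Artificial intelligence
Brake
Task (project management)
Engineering
Automotive engineering
Biochemistry
Gene
Chemistry
Systems engineering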
Authors
Tianqi Wang,Dong Eui Chang
Identifier
DOI:10.23919/iccas47443.2019.8971737
Abstract
We present a training pipeline for autonomous driving that takes the current camera image and vehicle speed as input and produces throttle, brake, and steering commands as output. The weather and lighting API of the AirSim simulator [1] provides enough diversity during training to help make the trained policy more robust. To avoid limiting the policy's achievable performance, we use a continuous, deterministic control policy. We use ResNet-34 [2] for both the actor and critic networks, with slight changes to the fully connected layers. Because humans have mastered this task and the task itself is highly complex, we first use imitation learning to mimic a given human policy, then carry the trained policy and its weights over to the reinforcement learning phase, for which we use DDPG [3]. This combination yields a considerable performance boost compared with both pure imitation learning and pure DDPG on the autonomous driving task.
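The two-phase idea in the abstract (imitation learning first, then DDPG starting from the imitation weights) can be illustrated with a toy sketch. This is not the authors' implementation: a two-weight linear policy stands in for the ResNet-34 actor, the demonstration data are made up, and every function name below is hypothetical. Only the DDPG critic target and the soft target update follow the standard DDPG formulation.

```python
from typing import List, Sequence


def behaviour_cloning_step(weights: Sequence[float], features: Sequence[float],
                           expert_action: float, lr: float = 0.1) -> List[float]:
    """One gradient step on 0.5 * (w . x - a_expert)^2 for a linear policy."""
    pred = sum(w * x for w, x in zip(weights, features))
    err = pred - expert_action
    return [w - lr * err * x for w, x in zip(weights, features)]


def td_target(reward: float, done: bool, next_q: float, gamma: float = 0.99) -> float:
    """Standard DDPG critic target: y = r + gamma * (1 - done) * Q'(s', mu'(s'))."""
    return reward + gamma * (0.0 if done else 1.0) * next_q


def soft_update(target: List[float], online: Sequence[float], tau: float = 0.005) -> None:
    """Polyak averaging, in place: target <- tau * online + (1 - tau) * target."""
    for i, w in enumerate(online):
        target[i] = tau * w + (1.0 - tau) * target[i]


# Phase 1 (imitation): fit the toy actor to made-up expert demonstrations.
actor = [0.0, 0.0]
demos = [([1.0, 0.0], 0.5), ([0.0, 1.0], -0.25)]  # (features, expert action), illustrative only
for _ in range(200):
    for x, a in demos:
        actor = behaviour_cloning_step(actor, x, a)

# Phase 2 (reinforcement learning): DDPG starts from the imitation weights
# instead of a random initialisation; the target actor begins as a copy.
target_actor = list(actor)
```

In a full DDPG loop, `td_target` would supply the critic's regression target for each sampled transition, and `soft_update` would be called after every actor and critic update so that the target networks track the online networks slowly.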