Computer science
Reinforcement learning
Artificial intelligence
Machine learning
Inference
Trajectory
Artificial neural network
Sample (material)
Chemistry
Physics
Chromatography
Astronomy
Authors
Jiangfeng Nan, Weiwen Deng, Ruzheng Zhang, Rui Zhao, Ying Wang, Juan Ding
Identifier
DOI:10.1109/tiv.2023.3335218
Abstract
Modeling driving behavior plays a pivotal role in advancing the development of human-like autonomous driving. To this end, this paper proposes a car-following behavior modeling method based on sample-based deep inverse reinforcement learning (DIRL). Traditional IRL represents the reward function with a feature-based linear function; because feature extraction is difficult and linear functions have limited fitting capacity, it achieves only low modeling accuracy. DIRL therefore uses deep neural networks to represent the reward function. However, DIRL requires reinforcement learning to determine the optimal policy under the learned reward function, which makes training and inference computationally expensive and inefficient. To address this issue, this paper proposes sample-based DIRL. By discretizing the solution space, sample-based DIRL reduces the integral computation of the partition function to a summation, improving computational efficiency. Specifically, it is a three-stage framework: sampling candidate trajectories, evaluating the candidate trajectories, and selecting the trajectory with the highest reward. To evaluate DIRL at the level of both driving behavior and the reward function, an MPC-based virtual driver with an explicit reward function is used to collect driving data for training and to assess the convergence of the learned reward function. The experimental results confirm that the proposed method can accurately model car-following behavior and recover the driver's reward function from the driving data.
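To make the sample-based idea concrete, below is a minimal sketch of a maximum-entropy-style deep IRL loop in which the partition function is approximated by a sum over discretized candidate trajectories, and inference selects the highest-reward candidate. This is an illustration under stated assumptions, not the paper's implementation: the network architecture, trajectory dimensions (HORIZON, STATE_DIM), and the random candidate sampler are all hypothetical placeholders.

```python
# Minimal sketch of sample-based deep IRL (assumptions: max-entropy formulation,
# random placeholder candidate sampler, illustrative network sizes).
import torch
import torch.nn as nn

HORIZON = 50        # planning horizon in time steps (assumed)
STATE_DIM = 3       # e.g. gap, relative speed, ego acceleration (assumed)
N_CANDIDATES = 200  # size of the discretized solution space (assumed)


class RewardNet(nn.Module):
    """Deep reward function: maps a flattened trajectory to a scalar reward."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(HORIZON * STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, traj):                # traj: (batch, HORIZON, STATE_DIM)
        return self.net(traj.flatten(1)).squeeze(-1)


def sample_candidates(n=N_CANDIDATES):
    """Stage 1 stand-in: discretize the solution space into candidate
    trajectories (e.g. constant-acceleration rollouts); random here."""
    return torch.randn(n, HORIZON, STATE_DIM)


def irl_loss(reward_net, demo_traj, candidates):
    """Negative log-likelihood of the demonstrated trajectory, with the
    partition function replaced by a sum over sampled candidates."""
    r_demo = reward_net(demo_traj.unsqueeze(0))           # shape (1,)
    r_cand = reward_net(candidates)                        # shape (n,)
    log_z = torch.logsumexp(torch.cat([r_cand, r_demo]), dim=0)
    return -(r_demo.squeeze() - log_z)


def infer(reward_net, candidates):
    """Stages 2-3 at inference: evaluate candidates, pick the best one."""
    with torch.no_grad():
        return candidates[reward_net(candidates).argmax()]


if __name__ == "__main__":
    reward_net = RewardNet()
    optim = torch.optim.Adam(reward_net.parameters(), lr=1e-3)
    demo = torch.randn(HORIZON, STATE_DIM)   # placeholder demonstration
    for _ in range(100):
        cands = sample_candidates()
        loss = irl_loss(reward_net, demo, cands)
        optim.zero_grad()
        loss.backward()
        optim.step()
    best = infer(reward_net, sample_candidates())
    print("selected trajectory shape:", best.shape)
```

The key computational point the sketch tries to convey is in `irl_loss`: because the solution space is discretized into a finite candidate set, the normalizing integral becomes a `logsumexp` over sampled trajectory rewards, so no inner reinforcement-learning loop is needed to train the reward network.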