Reinforcement learning
Reinforcement
Computer science
Artificial intelligence
Psychology
Cognitive psychology
Social psychology
Authors
Sen Wang,Daoyuan Jia,Xinshuo Weng
Source
Journal: Cornell University - arXiv
Date: 2018-01-01
Citations: 63
Identifier
DOI:10.48550/arxiv.1811.11329
Abstract
Reinforcement learning has steadily improved and has outperformed humans in many traditional games since the resurgence of deep neural networks. However, these successes are not easy to transfer to autonomous driving, because real-world state spaces are extremely complex, action spaces are continuous, and fine control is required. Moreover, autonomous vehicles must maintain functional safety in complex environments. To address these challenges, we first adopt the deep deterministic policy gradient (DDPG) algorithm, which can handle complex state and action spaces in continuous domains. We then choose The Open Racing Car Simulator (TORCS) as our environment to avoid physical damage. We also select a set of appropriate sensor information from TORCS and design our own rewarder. To fit the DDPG algorithm to TORCS, we design network architectures for both the actor and the critic within the DDPG paradigm. To demonstrate the effectiveness of our model, we evaluate it on different modes in TORCS and show both quantitative and qualitative results.
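For readers unfamiliar with DDPG, the following is a minimal sketch of two computations at the core of the general algorithm: the critic's temporal-difference target and the soft update of the target networks. It is not the authors' implementation. The function names and the values of GAMMA and TAU are assumptions chosen for illustration, and plain Python lists stand in for network parameters.

```python
# Illustrative sketch of two generic DDPG steps; not taken from the paper.
# GAMMA and TAU are assumed hyperparameter values.

GAMMA = 0.99   # discount factor (assumption)
TAU = 0.001    # soft-update rate for the target networks (assumption)


def td_target(reward, next_q, done, gamma=GAMMA):
    """Critic regression target y = r + gamma * Q'(s', mu'(s')).

    next_q is the target critic's value for the next state and the
    target actor's action; it is ignored when the episode has ended.
    """
    if done:
        return reward
    return reward + gamma * next_q


def soft_update(target_params, online_params, tau=TAU):
    """Polyak averaging: theta_target <- tau * theta + (1 - tau) * theta_target."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_params, online_params)]


if __name__ == "__main__":
    # One hypothetical transition: reward 1.0, next-state value 2.0.
    print(td_target(1.0, 2.0, done=False))
    # Target parameters drift slowly toward the online parameters.
    print(soft_update([0.0, 1.0], [1.0, 1.0]))
```

A small TAU makes the target networks change slowly, which is the usual way DDPG stabilizes the critic's regression target in continuous action spaces such as steering and throttle control.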