Computer Science
Motion Planning
Path (Computing)
Artificial Intelligence
Robotics
Source
Venue: IEEE International Conference on Real-time Computing and Robotics
Date: 2020-09-28
Pages: 387-392
Identifier
DOI: 10.1109/rcar49640.2020.9303290
Abstract
End-to-end approaches based on Deep Reinforcement Learning have been shown to meet or exceed human-level strategic capabilities. Applying this learning paradigm to path planning gives robots self-contained learning and environment-interaction abilities and improves their generalization. In this paper, the Deep Q-Network (DQN), a typical Deep Reinforcement Learning method, is improved in two steps. First, the selection of actions by the current network is decoupled from the computation of the target Q value, eliminating the overestimation caused by rapidly optimizing the Q value in a possibly wrong direction. Second, since the action-value function reflects not only the benefit of the agent's highest-valued action but also the influence of the static environment itself, the final output is a linear combination of two parts: the estimated value functions of the up, down, left, and right actions produced by the neural network, and the value of the environment state itself. Under the same experimental conditions, the improved DQN network is compared with the original DQN network, and the results show that the final target value function estimated by the improved network is more accurate and more effective for virtual path planning tasks.
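The two improvements described above correspond to a Double-DQN-style decoupled target (selection by one network, evaluation by another) and a dueling-style decomposition of the Q value into a state value plus per-action terms. The following is a minimal sketch of how the two parts might fit together, assuming a PyTorch implementation with the four grid actions (up, down, left, right) mentioned in the abstract; the names DuelingQNet and double_dqn_target are illustrative, not taken from the paper's code.

```python
# Hedged sketch of the abstract's two DQN modifications (PyTorch assumed).
# All class/function names here are illustrative assumptions, not the paper's.
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Q(s, a) = V(s) + A(s, a) - mean_a A(s, a): a linear combination of
    the state's own value and the values of the four grid actions."""
    def __init__(self, state_dim: int, num_actions: int = 4, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)                # value of the state itself
        self.advantage = nn.Linear(hidden, num_actions)  # up / down / left / right
    def forward(self, s: torch.Tensor) -> torch.Tensor:
        h = self.feature(s)
        v = self.value(h)      # shape (batch, 1)
        a = self.advantage(h)  # shape (batch, 4)
        # Subtracting the mean advantage keeps the V/A split identifiable.
        return v + a - a.mean(dim=1, keepdim=True)

@torch.no_grad()
def double_dqn_target(online: nn.Module, target: nn.Module,
                      r: torch.Tensor, s_next: torch.Tensor,
                      done: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """Decouple action selection (online net) from evaluation (target net)
    to curb the overestimation of vanilla DQN's max operator."""
    best_actions = online(s_next).argmax(dim=1, keepdim=True)   # select
    q_eval = target(s_next).gather(1, best_actions).squeeze(1)  # evaluate
    return r + gamma * (1.0 - done) * q_eval
```

In this formulation the target network scores the action chosen by the online network rather than taking its own maximum, which is the decoupling step, while the dueling head realizes the value/advantage combination over the four movement actions.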