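Reinforcement learning
Computer science
Rebar
Artificial intelligence
Value (mathematics)
Reinforcement learning (alternative term)
Error-driven learning
Bellman equation
Machine learning
Mathematical optimization
Engineering
Mathematics
Structural engineering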
Authors
Yunting Liu, Yang Jia-ming, Liang Chen, Ting Guo, Yu Jiang
Identifier
DOI:10.1109/ccdc49329.2020.9164615
Abstract
Reinforcement learning methods fall into two main categories: value-function-based and policy-based. This article systematically introduces and summarizes the methods in each category. First, it summarizes value-function-based methods, including classic Q-learning, DQN, and effective improvements built on DQN. Then it introduces policy-based methods, including policy gradient, policy optimization, actor-critic, and their improvements. Finally, it summarizes frontier research on and applications of reinforcement learning.