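Reinforcement learning
Computer science
Adaptation (eye)
Variance (accounting)
Artificial intelligence
Multi-agent system
Action (physics)
Set (abstract data type)
Machine learning
Economics
Quantum mechanics
Optics
Physics
Accounting
Programming language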
Authors
Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch
Source
Journal: Cornell University - arXiv
Date: 2017-06-07
Volume/Issue: 30: 6379-6390
Citations: 737
Abstract
We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. We then present an adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination. Additionally, we introduce a training regimen utilizing an ensemble of policies for each agent that leads to more robust multi-agent policies. We show the strength of our approach compared to existing methods in cooperative as well as competitive scenarios, where agent populations are able to discover various physical and informational coordination strategies.
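The abstract describes an actor-critic adaptation that "considers action policies of other agents". One way to picture this is a critic that scores the joint observations and actions of all agents, while each actor maps only its own observation to an action. The following is a minimal sketch under that reading, not the authors' implementation: the agent count, dimensions, linear function forms, and all names (`make_actor`, `make_central_critic`) are hypothetical, and no training step is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_AGENTS = 3   # number of agents
OBS_DIM = 4    # per-agent observation size
ACT_DIM = 2    # per-agent action size


def make_actor(obs_dim, act_dim):
    """Actor: maps one agent's own observation to that agent's action."""
    w = rng.normal(scale=0.1, size=(obs_dim, act_dim))
    return lambda obs: np.tanh(obs @ w)


def make_central_critic(n_agents, obs_dim, act_dim):
    """Critic: scores the joint observations and joint actions of all agents."""
    w = rng.normal(scale=0.1, size=n_agents * (obs_dim + act_dim))

    def critic(all_obs, all_acts):
        # Flatten and concatenate everything every agent saw and did.
        joint = np.concatenate([np.ravel(all_obs), np.ravel(all_acts)])
        return float(joint @ w)

    return critic


# One actor and one critic per agent (untrained, random weights).
actors = [make_actor(OBS_DIM, ACT_DIM) for _ in range(N_AGENTS)]
critics = [make_central_critic(N_AGENTS, OBS_DIM, ACT_DIM) for _ in range(N_AGENTS)]

# Each actor acts on its own observation only.
observations = rng.normal(size=(N_AGENTS, OBS_DIM))
actions = np.stack([actor(obs) for actor, obs in zip(actors, observations)])

# Each critic evaluates the full joint observation-action configuration.
q_values = [critic(observations, actions) for critic in critics]
```

Because every critic receives the other agents' actions as explicit inputs, a change in another agent's behavior shows up as a change in the critic's input rather than as an unexplained shift in the environment, which is the non-stationarity problem the abstract attributes to Q-learning.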