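Reinforcement learning
Nash equilibrium
Computer science
Mathematical optimization
Best response
Flexibility (engineering)
Action (physics)
Artificial intelligence
Mathematical economics
Mathematics
Quantum mechanics
Statistics
Physics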
Authors
Philippe Casgrain, Brian Ning, Sebastian Jaimungal
Identifier
DOI:10.1080/1350486x.2022.2136727
Abstract
Model-free learning for multi-agent stochastic games is an active area of research. Existing reinforcement learning algorithms, however, are often restricted to zero-sum games and are applicable only in small state-action spaces or other simplified settings. Here, we develop a new data-efficient Deep-Q-learning methodology for model-free learning of Nash equilibria for general-sum stochastic games. The algorithm uses a locally linear-quadratic expansion of the stochastic game, which leads to analytically solvable optimal actions. The expansion is parametrized by deep neural networks to give it sufficient flexibility to learn the environment without the need to experience all state-action pairs. We study symmetry properties of the algorithm stemming from label-invariant stochastic games and, as a proof of concept, apply our algorithm to learning optimal trading strategies in competitive electronic markets.
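The abstract's remark that a linear-quadratic expansion "leads to analytically solvable optimal actions" can be illustrated with a toy example. The sketch below is not the authors' parametrization: it assumes a hypothetical one-shot game in which each player i has a scalar action a_i and payoff Q_i(a) = b_i a_i - 0.5 c_i a_i^2 + a_i * sum_{j != i} D_ij a_j, with c_i > 0. The names `b`, `c`, `D`, `quadratic_payoff`, and `nash_actions` are illustrative. Because each payoff is concave in the player's own action, the stacked first-order conditions form a linear system whose solution is a Nash equilibrium, provided the system matrix is invertible.

```python
import numpy as np


def quadratic_payoff(i, a, b, c, D):
    """Payoff of player i: b_i a_i - 0.5 c_i a_i^2 + a_i * sum_{j != i} D_ij a_j."""
    cross = D[i] @ a - D[i, i] * a[i]  # interaction with the other players only
    return b[i] * a[i] - 0.5 * c[i] * a[i] ** 2 + a[i] * cross


def nash_actions(b, c, D):
    """Solve the stacked first-order conditions for all players at once.

    Setting dQ_i/da_i = b_i - c_i a_i + sum_{j != i} D_ij a_j = 0 for every i
    gives the linear system (diag(c) - D_off) a = b.
    """
    D_off = D - np.diag(np.diag(D))  # ignore any self-interaction entries
    return np.linalg.solve(np.diag(c) - D_off, b)


# Hypothetical coefficients for a two-player general-sum game (assumed values).
b = np.array([1.0, 2.0])
c = np.array([2.0, 3.0])  # c_i > 0 makes each payoff strictly concave in a_i
D = np.array([[0.0, 0.5],
              [-1.0, 0.0]])

a_star = nash_actions(b, c, D)
print("Nash actions:", a_star)

# Sanity check: no player gains from a unilateral deviation.
for i in range(len(b)):
    for delta in (-0.5, -0.1, 0.1, 0.5):
        a_dev = a_star.copy()
        a_dev[i] += delta
        assert quadratic_payoff(i, a_dev, b, c, D) < quadratic_payoff(i, a_star, b, c, D)
```

In a learning setting, coefficients such as `b`, `c`, and `D` would be state-dependent outputs of a function approximator rather than fixed arrays; the closed-form solve is what removes the need for an inner numerical search over actions.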