Curse of dimensionality
Computer science
Machine learning
Artificial intelligence
Action (physics)
Algorithm
Mathematics
Physics
Quantum mechanics
Authors
Sheng Wang, Long Zhang, Yang Liu, Zhen Wang
Identifier
DOI: 10.1098/rspa.2022.0667
Abstract
The discrete strategy game, in which agents can only choose cooperation or defection, has received considerable attention. However, this assumption seems implausible in the real world, where choices may be continuous or mixed. Moreover, when applying Q-learning to continuous or mixed strategy games, one challenge is that the learning space grows drastically as the number of states and actions increases. In this article, we therefore redesign the Q-learning method by considering spatial reciprocity: agents interact only with their four neighbours to obtain rewards and learn actions by taking the neighbours' strategies into account. As a result, the learning state and action space is transformed into a 5 × 5 table that stores the state and action of the focal agent and its four neighbours, avoiding the curse of dimensionality caused by a continuous or mixed strategy game. The numerical simulation results reveal striking differences among the three classes of games. In particular, the discrete strategy game is more sensitive to the settings of the relevant parameters, whereas the other two strategy games are relatively stable. Meanwhile, in terms of promoting cooperation, the mixed strategy game always outperforms the continuous one.
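The abstract's key trick is that each agent learns over a discretized state–action space built from its own strategy and its four neighbours' strategies, so the per-agent Q-table stays 5 × 5 even when the underlying strategy is continuous or mixed. Below is a minimal sketch of that idea, not the authors' exact formulation: the lattice size, the donation-style payoff, the definition of the state as the binned neighbourhood average, the epsilon-greedy exploration, and all parameter values (ALPHA, GAMMA, EPS, B, C) are illustrative assumptions.

```python
# Minimal sketch of Q-learning with spatial reciprocity on a lattice, where each
# agent's state-action space is discretized into a 5 x 5 Q-table.
# NOT the paper's exact formulation: payoff, state definition and parameters are assumed.
import numpy as np

L = 20                              # lattice side length (assumption)
BINS = 5                            # 5 discretized cooperation levels -> 5 x 5 Q-table
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.05  # learning rate, discount, exploration rate (assumptions)
B, C = 1.0, 0.5                     # assumed benefit/cost in a donation-style payoff

rng = np.random.default_rng(0)
Q = np.zeros((L, L, BINS, BINS))           # one 5 x 5 table per agent: Q[state, action]
coop = rng.integers(0, BINS, size=(L, L))  # each agent's current cooperation bin (0..4)

def neighbour_state(i, j):
    """State = rounded mean cooperation bin of the four von Neumann neighbours (assumption)."""
    s4 = (coop[(i - 1) % L, j] + coop[(i + 1) % L, j]
          + coop[i, (j - 1) % L] + coop[i, (j + 1) % L])
    return int((s4 + 2) // 4)              # nearest bin, always in 0..BINS-1

def payoff(i, j):
    """Assumed continuous donation payoff accumulated over the four neighbours."""
    own = coop[i, j] / (BINS - 1)
    total = 0.0
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        other = coop[(i + di) % L, (j + dj) % L] / (BINS - 1)
        total += B * other - C * own
    return total

for step in range(10000):
    i, j = rng.integers(0, L, size=2)
    s = neighbour_state(i, j)
    # epsilon-greedy choice over the 5 discretized cooperation levels
    a = int(rng.integers(0, BINS)) if rng.random() < EPS else int(np.argmax(Q[i, j, s]))
    coop[i, j] = a                          # act: adopt the chosen cooperation level
    r = payoff(i, j)
    s_next = neighbour_state(i, j)
    # standard Q-learning update on this agent's own 5 x 5 table
    Q[i, j, s, a] += ALPHA * (r + GAMMA * np.max(Q[i, j, s_next]) - Q[i, j, s, a])

print("mean cooperation level:", coop.mean() / (BINS - 1))
```

Whatever the exact payoff or state definition used in the paper, the point the sketch illustrates is the same: the per-agent table never grows beyond 5 × 5, no matter how finely the underlying continuous or mixed strategy could otherwise be resolved.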