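Punishment (psychology)
Perspective (graphical)
Reinforcement learning
Reinforcement
Computer science
Artificial intelligence
Mathematical economics
Cognitive science
Psychology
Mathematics
Social psychology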
Authors
Chenyang Zhao, Guozhong Zheng, Zhang Chun, Jiqiang Zhang, Li Chen
Source
Journal: Chaos
[American Institute of Physics]
Date: 2024-07-01
Volume (issue): 34 (7)
Citations: 2
Abstract
Punishment is a common tactic for sustaining cooperation and has been studied extensively. While most previous game-theoretic work adopts the imitation learning framework, in which players imitate the strategies of those who are better off, the learning logic in the real world is often much more complex. In this work, we turn to the reinforcement learning paradigm, in which individuals make decisions based on their experience and long-term returns. Specifically, we investigate the prisoner's dilemma game with a Q-learning algorithm, in which cooperators probabilistically impose punishment on defectors in their neighborhood. Unexpectedly, we find that punishment can lead to either continuous or discontinuous cooperation phase transitions, and the nucleation process of cooperation clusters is reminiscent of the liquid-gas transition. The analysis of the Q-table reveals the evolution of the underlying "psychologic" changes, which explains the nucleation process and the different levels of cooperation. The uncovered first-order phase transition indicates that implementing punishment requires greater care in the discontinuous case than in the continuous scenario.
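The mechanism the abstract describes (tabular Q-learning players in a prisoner's dilemma, with cooperators punishing defectors probabilistically) can be illustrated with a minimal two-player sketch. This is not the authors' model: the paper studies players with neighborhoods, whereas the toy below has only two agents. The payoff values, the choice of state (the opponent's last action), and every parameter name and default (`temptation`, `p_punish`, `fine`, `cost`, `alpha`, `gamma`, `epsilon`) are assumptions made for illustration.

```python
import random

C, D = 0, 1  # actions: cooperate, defect


def payoff(a, b, temptation=1.2):
    """Payoff to the player choosing `a` against `b` (assumed weak prisoner's dilemma values)."""
    if a == C and b == C:
        return 1.0
    if a == D and b == C:
        return temptation
    return 0.0  # exploited cooperator and mutual defection both earn 0


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update, applied in place."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def choose(q, state, epsilon, rng):
    """Epsilon-greedy action selection (ties favor cooperation)."""
    if rng.random() < epsilon:
        return rng.choice((C, D))
    return C if q[state][C] >= q[state][D] else D


def simulate(rounds=1000, p_punish=0.5, fine=1.0, cost=0.2, epsilon=0.05, seed=0):
    """Return the fraction of cooperative actions over all rounds."""
    rng = random.Random(seed)
    q1 = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    q2 = [[0.0, 0.0], [0.0, 0.0]]
    s1 = s2 = C  # assumed state: the opponent's previous action
    coop = 0
    for _ in range(rounds):
        a1 = choose(q1, s1, epsilon, rng)
        a2 = choose(q2, s2, epsilon, rng)
        r1, r2 = payoff(a1, a2), payoff(a2, a1)
        # A cooperator punishes a defecting opponent with probability p_punish:
        # the punisher pays `cost`, the defector loses `fine`.
        if a1 != a2 and rng.random() < p_punish:
            if a1 == C:
                r1, r2 = r1 - cost, r2 - fine
            else:
                r2, r1 = r2 - cost, r1 - fine
        q_update(q1, s1, a1, r1, a2)
        q_update(q2, s2, a2, r2, a1)
        s1, s2 = a2, a1
        coop += (a1 == C) + (a2 == C)
    return coop / (2 * rounds)
```

Calling `simulate(p_punish=0.0)` and `simulate(p_punish=1.0)` gives a quick way to compare cooperation levels with and without punishment in this toy setting; it says nothing about the phase transitions reported in the paper, which depend on the full model.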