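Leverage (statistics)
Computer science
Population
Diversity (politics)
Niche
Quality (philosophy)
Psychology
Artificial intelligence
Ecology
Political science
Sociology
Biology
Epistemology
Philosophy
Demography
Law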
Authors
Rodrigo Canaan, Xianbo Gao, Julian Togelius, Andy Nealen, Stefan Menzel
Source
Journal: IEEE Transactions on Games
[Institute of Electrical and Electronics Engineers]
Date: 2022-04-25
Volume (issue), pages: 15 (2): 228-241
Citations: 2
Identifier
DOI: 10.1109/tg.2022.3169168
Abstract
Hanabi is a cooperative game that brings the problem of modeling other players to the forefront. In this game, coordinated groups of players can leverage pre-established conventions to great effect. In this paper, we focus on ad hoc settings with no previous coordination between partners. We introduce a "Bayesian Meta-Agent" that maintains a belief distribution over hypotheses of partner policies. The policies that serve as initial hypotheses are generated using MAP-Elites, to ensure behavioral diversity. We evaluate an "Adaptive" version of the agent, which selects a response policy based on the updated belief distribution, and a "Generalist" version, which selects a response based on the uniform prior. In short episodes of 10 games with a consistent partner, the "Adaptive" version outperforms the "Generalist" when the training and evaluation populations are the same. This presents a first step towards an agent that can model its partner and adapt within a time frame that is compatible with human interaction.
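
The abstract does not spell out the update rule or how responses are scored, so the following is only a minimal sketch of the general idea, assuming a finite set of partner-policy hypotheses, a standard Bayes' rule update, and a precomputed table of expected scores. The function names (`update_belief`, `best_response`), the `payoff` table, and all numbers are hypothetical and are not taken from the paper.

```python
def update_belief(prior, likelihoods):
    """One Bayes' rule step: posterior[h] is proportional to
    prior[h] * P(observed partner action | hypothesis h)."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # The observation is impossible under every hypothesis; keep the prior.
        return list(prior)
    return [u / total for u in unnormalized]


def best_response(belief, payoff):
    """Return the index of the response policy with the highest expected
    score under `belief`. payoff[r][h] is the (assumed known) expected
    score of response r when paired with partner hypothesis h."""
    expected = [sum(b * s for b, s in zip(belief, row)) for row in payoff]
    return max(range(len(payoff)), key=lambda r: expected[r])


# Hypothetical score table: 4 response policies x 3 partner hypotheses.
payoff = [
    [20.0, 5.0, 5.0],     # specialised to hypothesis 0
    [5.0, 20.0, 5.0],     # specialised to hypothesis 1
    [5.0, 5.0, 20.0],     # specialised to hypothesis 2
    [12.0, 12.0, 12.0],   # decent with every partner
]

uniform_prior = [1 / 3, 1 / 3, 1 / 3]

# "Generalist": respond to the uniform prior, never updating it.
generalist_choice = best_response(uniform_prior, payoff)

# "Adaptive": update the belief after each observed partner action.
# Each row holds the probability that each hypothesis assigns to the
# action the partner actually took (made-up values).
observed_likelihoods = [
    [0.1, 0.7, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.8, 0.1],
]
belief = uniform_prior
for likelihoods in observed_likelihoods:
    belief = update_belief(belief, likelihoods)
adaptive_choice = best_response(belief, payoff)

print(generalist_choice, adaptive_choice, [round(b, 3) for b in belief])
```

In this toy setup the generalist settles on the all-round response, while the adaptive variant switches to the response specialised for the hypothesis that best explains the observed actions. In the paper's setting the hypotheses would be the MAP-Elites-generated policies rather than hand-written rows of numbers.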