Authors
Julien Pérolat, Bart De Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent C.J. de Boer, Paul Müller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Élie, Sarah H. Cen, Zhe Wang, Audrūnas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen, Tom Eccles, Mark Rowland, Marc Lanctot, Jean-Baptiste Lespiau, Bilal Piot, Shayegan Omidshafiei, Edward Lockhart, Laurent Sifre, Nathalie Beauguerlange, Rémi Munos, David Silver, Satinder Singh, Demis Hassabis, Karl Tuyls
Abstract
We introduce DeepNash, an autonomous agent that plays the imperfect information game Stratego at a human expert level. Stratego is one of the few iconic board games that artificial intelligence (AI) has not yet mastered. It is a game characterized by a twin challenge: It requires long-term strategic thinking as in chess, but it also requires dealing with imperfect information as in poker. The technique underpinning DeepNash uses a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master Stratego through self-play from scratch. DeepNash beat existing state-of-the-art AI methods in Stratego and achieved a year-to-date (2022) and all-time top-three ranking on the Gravon games platform, competing with human expert players.