DeepNash: Learning to Play the Imperfect Information Game Stratego
Mastering the Game of Stratego with Model-Free Multiagent Reinforcement Learning
We introduce a game-theoretic, model-free deep reinforcement learning method, without search, that learns to master the board game Stratego from scratch, up to a human expert level.
The Regularised Nash Dynamics (R-NaD) algorithm, a key component of DeepNash, converges to an approximate Nash equilibrium, instead of 'cycling' around it, by directly modifying the underlying multi-agent learning dynamics.
DeepNash beat existing state-of-the-art methods in Stratego and achieved a yearly (2022) and all-time top-3 rank on the Gravon games platform, competing with human expert players.
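To give a flavour of the idea of regularising the learning dynamics so they converge instead of cycling, the following is a minimal, hypothetical sketch on a toy matrix game (rock-paper-scissors), not the paper's R-NaD implementation: each player's payoff is shifted by a penalty for drifting away from a regularisation ("anchor") policy, and the anchor is periodically reset to the current policy. The payoff matrix, step sizes, and update rule below are illustrative assumptions.

```python
import numpy as np

# Rock-paper-scissors payoff matrix for the row player (zero-sum game).
A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])

def regularised_step(x, y, x_reg, y_reg, eta, lr):
    """One step of reward-regularised multiplicative-weights dynamics.

    Each player's payoff is shifted by -eta * log(pi / pi_reg), which
    penalises moving far from the anchor policy and damps the cycling
    seen with the unregularised dynamics.
    """
    rx = A @ y - eta * np.log(x / x_reg)      # regularised payoff, row player
    ry = -A.T @ x - eta * np.log(y / y_reg)   # regularised payoff, column player
    x = x * np.exp(lr * (rx - x @ rx))        # multiplicative-weights update
    y = y * np.exp(lr * (ry - y @ ry))
    return x / x.sum(), y / y.sum()

x = y = np.array([0.6, 0.3, 0.1])             # start away from the Nash equilibrium
x_reg, y_reg = x.copy(), y.copy()
for _ in range(20):                           # outer loop: refresh the anchor policy
    for _ in range(2000):                     # inner loop: run the regularised dynamics
        x, y = regularised_step(x, y, x_reg, y_reg, eta=0.2, lr=0.01)
    x_reg, y_reg = x.copy(), y.copy()
print(x, y)                                   # both approach the uniform Nash (1/3, 1/3, 1/3)
```

Without the regularisation term (eta = 0), these dynamics orbit the equilibrium rather than approaching it; the anchor-and-penalty scheme is what pulls the policies inwards.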
Authors
Julien Perolat, Bart de Vylder, Daniel Hennes, Eugene Tarassov, Florian Strub, Vincent de Boer, Paul Muller, Jerome T. Connor, Neil Burch, Thomas Anthony, Stephen McAleer, Romuald Elie, Sarah H. Cen, Zhe Wang, Audrunas Gruslys, Aleksandra Malysheva, Mina Khan, Sherjil Ozair, Finbarr Timbers, Toby Pohlen