Combining deep reinforcement learning and search for imperfect-information games

N Brown, A Bakhtin, A Lerer… - Advances in Neural …, 2020 - proceedings.neurips.cc
The combination of deep reinforcement learning and search at both training and test time is
a powerful paradigm that has led to a number of successes in single-agent settings and …

Deep counterfactual regret minimization

N Brown, A Lerer, S Gross… - … conference on machine …, 2019 - proceedings.mlr.press
Counterfactual Regret Minimization (CFR) is the leading algorithm for solving large
imperfect-information games. It converges to an equilibrium by iteratively traversing the …
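
As a rough illustration of the update that CFR repeats at every information set, the sketch below runs plain tabular regret matching in self-play on rock-paper-scissors. The toy game and the self-play loop are assumptions for illustration only, not the deep network-based method of the paper; the time-averaged strategies approach the uniform Nash equilibrium.

```python
import numpy as np

# Rock-paper-scissors payoff for player 1 (row player); an illustrative stand-in
# for the per-infoset update CFR performs in a full extensive-form game.
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]], dtype=float)

def regret_matching(cum_regret):
    """Map cumulative regrets to a strategy (uniform if no positive regret)."""
    pos = np.maximum(cum_regret, 0.0)
    if pos.sum() > 0:
        return pos / pos.sum()
    return np.full(len(cum_regret), 1.0 / len(cum_regret))

R1, R2 = np.zeros(3), np.zeros(3)   # cumulative regrets
S1, S2 = np.zeros(3), np.zeros(3)   # cumulative strategies (their average converges to Nash)

for t in range(10000):
    x = regret_matching(R1)
    y = regret_matching(R2)
    u1 = A @ y          # value of each pure action against the opponent's strategy
    u2 = -A.T @ x
    R1 += u1 - x @ u1   # regret for not having played each action
    R2 += u2 - y @ u2
    S1 += x
    S2 += y

print("average strategies:", S1 / S1.sum(), S2 / S2.sum())  # both ~ (1/3, 1/3, 1/3)
```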

Learning in games: a systematic review

RJ Qin, Y Yu - Science China Information Sciences, 2024 - Springer
Game theory studies mathematical models of self-interested individuals. Nash
equilibrium is arguably the most central solution concept in game theory. While finding the Nash …
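
As a concrete complement to the snippet, the sketch below computes a Nash equilibrium of a tiny two-player zero-sum matrix game via the classical linear-programming formulation. The payoff matrix and the use of scipy.optimize.linprog are illustrative assumptions, not material from the survey.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative zero-sum payoff matrix for the row player (a matching-pennies variant).
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

m, n = A.shape
# Variables: x_1..x_m (row strategy) and v (game value). Maximize v, i.e. minimize -v,
# subject to  A^T x >= v * 1  (every column response yields at least v),  sum x = 1,  x >= 0.
c = np.concatenate([np.zeros(m), [-1.0]])
A_ub = np.hstack([-A.T, np.ones((n, 1))])          # -A^T x + v <= 0 for every column
b_ub = np.zeros(n)
A_eq = np.concatenate([np.ones(m), [0.0]]).reshape(1, -1)
b_eq = np.array([1.0])
bounds = [(0, None)] * m + [(None, None)]          # v is unrestricted in sign

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x, v = res.x[:m], res.x[m]
print("row-player Nash strategy:", x, "game value:", v)  # ~ (0.5, 0.5), value ~ 0
```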

Kernelized multiplicative weights for 0/1-polyhedral games: Bridging the gap between learning in extensive-form and normal-form games

G Farina, CW Lee, H Luo… - … Conference on Machine …, 2022 - proceedings.mlr.press
While extensive-form games (EFGs) can be converted into normal-form games (NFGs),
doing so comes at the cost of an exponential blowup of the strategy space. So, progress on …
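
The blowup mentioned in the snippet is easy to quantify. The back-of-the-envelope sketch below is illustrative counting only (not the paper's kernelized algorithm): for a perfect-recall player acting at k information sets with b actions each, it compares the b**k unreduced pure strategies of the induced normal form with the roughly k*b coordinates of the sequence-form representation.

```python
def normal_form_pure_strategies(num_infosets: int, actions_per_infoset: int) -> int:
    # One action choice per information set -> b**k unreduced pure strategies.
    return actions_per_infoset ** num_infosets

def sequence_form_size(num_infosets: int, actions_per_infoset: int) -> int:
    # One realization-probability coordinate per (infoset, action) pair.
    return num_infosets * actions_per_infoset

b = 3
for k in (5, 10, 20, 40):
    print(f"k={k:>2} infosets, b={b} actions: "
          f"{normal_form_pure_strategies(k, b):.3e} normal-form pure strategies vs. "
          f"{sequence_form_size(k, b)} sequence-form coordinates")
```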

Faster game solving via predictive blackwell approachability: Connecting regret matching and mirror descent

G Farina, C Kroer, T Sandholm - … of the AAAI Conference on Artificial …, 2021 - ojs.aaai.org
Blackwell approachability is a framework for reasoning about repeated games with vector-
valued payoffs. We introduce predictive Blackwell approachability, where an estimate of the …
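
One concrete consequence of predictive Blackwell approachability is predictive regret matching+ (RM+). The sketch below is a simplified self-play version on an assumed 2x2 zero-sum game, using the previous instantaneous regret vector as the prediction; it is an illustration of the idea, not the paper's full algorithm.

```python
import numpy as np

def normalize(q):
    s = q.sum()
    return q / s if s > 0 else np.full(len(q), 1.0 / len(q))

# Illustrative zero-sum game (payoffs for the row player).
A = np.array([[3.0, -1.0],
              [-2.0, 1.0]])

m, n = A.shape
Q1, Q2 = np.zeros(m), np.zeros(n)        # RM+ cumulative regrets (kept nonnegative)
pred1, pred2 = np.zeros(m), np.zeros(n)  # predictions = previous instantaneous regrets
S1, S2 = np.zeros(m), np.zeros(n)

for t in range(20000):
    # Predictive step: act on the accumulated regrets plus the predicted next regret.
    x = normalize(np.maximum(Q1 + pred1, 0.0))
    y = normalize(np.maximum(Q2 + pred2, 0.0))
    u1, u2 = A @ y, -A.T @ x
    r1 = u1 - x @ u1                      # instantaneous regrets
    r2 = u2 - y @ u2
    Q1 = np.maximum(Q1 + r1, 0.0)         # RM+ truncation keeps regrets nonnegative
    Q2 = np.maximum(Q2 + r2, 0.0)
    pred1, pred2 = r1, r2
    S1, S2 = S1 + x, S2 + y

# Averages approach the mixed equilibrium (3/7, 4/7) and (2/7, 5/7).
print("average strategies:", S1 / S1.sum(), S2 / S2.sum())
```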

Near-optimal learning of extensive-form games with imperfect information

Y Bai, C Jin, S Mei, T Yu - International Conference on …, 2022 - proceedings.mlr.press
This paper resolves the open question of designing near-optimal algorithms for learning
imperfect-information extensive-form games from bandit feedback. We present the first line …

Learning in two-player zero-sum partially observable Markov games with perfect recall

T Kozuno, P Ménard, R Munos… - Advances in Neural …, 2021 - proceedings.neurips.cc
We study the problem of learning a Nash equilibrium (NE) in an extensive game with
imperfect information (EGII) through self-play. Precisely, we focus on two-player, zero-sum …

Block-coordinate methods and restarting for solving extensive-form games

D Chakrabarti, J Diakonikolas… - Advances in Neural …, 2023 - proceedings.neurips.cc
Coordinate descent methods are popular in machine learning and optimization for their
simple sparse updates and excellent practical performance. In the context of large-scale …
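
For readers unfamiliar with the "simple sparse updates" the snippet refers to, the sketch below shows generic cyclic coordinate descent on a least-squares objective. It is an assumed textbook setting for illustration, not the paper's block-coordinate method for extensive-form games.

```python
import numpy as np

# Cyclic coordinate descent on least squares: minimize 0.5 * ||X w - y||^2.
# Each step solves exactly for one coordinate, touching a single column of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5) + 0.01 * rng.normal(size=100)

w = np.zeros(5)
col_sq = (X ** 2).sum(axis=0)     # per-coordinate curvature, precomputed
residual = y - X @ w

for sweep in range(50):
    for j in range(5):
        grad_j = -X[:, j] @ residual      # partial derivative w.r.t. w_j
        step = -grad_j / col_sq[j]        # exact minimizer along coordinate j
        w[j] += step
        residual -= step * X[:, j]        # keep the residual in sync cheaply

print("recovered weights:", w)
print("objective:", 0.5 * np.sum((X @ w - y) ** 2))
```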

Optimistic regret minimization for extensive-form games via dilated distance-generating functions

G Farina, C Kroer, T Sandholm - Advances in neural …, 2019 - proceedings.neurips.cc
We study the performance of optimistic regret-minimization algorithms for both minimizing
regret in, and computing Nash equilibria of, zero-sum extensive-form games. In order to …
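
The simplex special case of an optimistic regret minimizer is easy to write down. The sketch below runs optimistic Hedge (optimistic online mirror descent with the entropy regularizer, using the last observed loss as the prediction) in self-play on an assumed small zero-sum matrix game; it does not use the dilated sequence-form construction studied in the paper.

```python
import numpy as np

def optimistic_hedge(cum_loss, predicted_loss, eta):
    """Optimistic Hedge over the simplex: respond to the cumulative loss plus a
    prediction of the next loss (here, the most recently observed loss)."""
    logits = -eta * (cum_loss + predicted_loss)
    w = np.exp(logits - logits.max())     # subtract max for numerical stability
    return w / w.sum()

# Illustrative zero-sum matrix game (row player's payoffs).
A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 1.0, -0.5]])

eta, T = 0.1, 5000
Lx, Ly = np.zeros(2), np.zeros(3)        # cumulative losses
mx, my = np.zeros(2), np.zeros(3)        # predictions = last observed losses
x_avg, y_avg = np.zeros(2), np.zeros(3)

for t in range(T):
    x = optimistic_hedge(Lx, mx, eta)
    y = optimistic_hedge(Ly, my, eta)
    lx = -(A @ y)                         # row player's loss vector
    ly = A.T @ x                          # column player's loss vector
    Lx, Ly = Lx + lx, Ly + ly
    mx, my = lx, ly
    x_avg, y_avg = x_avg + x, y_avg + y

x_avg, y_avg = x_avg / T, y_avg / T
# Duality gap of the average strategies; it shrinks toward 0 as T grows.
print("duality gap:", (A @ y_avg).max() - (A.T @ x_avg).min())
```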

Better regularization for sequential decision spaces: Fast convergence rates for Nash, correlated, and team equilibria

G Farina, C Kroer, T Sandholm - arXiv preprint arXiv:2105.12954, 2021 - arxiv.org
We study the application of iterative first-order methods to the problem of computing
equilibria of large-scale two-player extensive-form games. First-order methods must typically …