Combining deep reinforcement learning and search for imperfect-information games
The combination of deep reinforcement learning and search at both training and test time is
a powerful paradigm that has led to a number of successes in single-agent settings and …
Deep counterfactual regret minimization
Counterfactual Regret Minimization (CFR) is the leading algorithm for solving large
imperfect-information games. It converges to an equilibrium by iteratively traversing the …
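For context, the per-iteration update that the CFR family (including Deep CFR) builds on is regret matching: accumulated positive regrets are normalized into the next strategy. Below is a minimal self-play sketch of that update on a toy zero-sum matrix game; the payoff matrix, horizon, and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def regret_matching_strategy(cum_regret):
    """Map cumulative regrets to a strategy: positive part, normalized."""
    positive = np.maximum(cum_regret, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full_like(cum_regret, 1.0 / len(cum_regret))

# Self-play with regret matching on rock-paper-scissors (illustrative only).
A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])

regret_row = np.zeros(3)
regret_col = np.zeros(3)
avg_row = np.zeros(3)
avg_col = np.zeros(3)

T = 10_000
for _ in range(T):
    x = regret_matching_strategy(regret_row)
    y = regret_matching_strategy(regret_col)
    u_row = A @ y          # payoff of each row action vs. the opponent's mix
    u_col = -(A.T @ x)     # zero-sum: column player's action payoffs
    regret_row += u_row - x @ u_row
    regret_col += u_col - y @ u_col
    avg_row += x
    avg_col += y

print("average strategies:", avg_row / T, avg_col / T)  # ~uniform at equilibrium
```

Deep CFR replaces the per-infoset regret tables with neural networks trained on sampled regret targets, but the regret-matching map from accumulated regrets to strategies is the same.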
Learning in games: a systematic review
RJ Qin, Y Yu - Science China Information Sciences, 2024 - Springer
Game theory studies the mathematical models for self-interested individuals. Nash
equilibrium is arguably the most central solution in game theory. While finding the Nash …
Kernelized multiplicative weights for 0/1-polyhedral games: Bridging the gap between learning in extensive-form and normal-form games
While extensive-form games (EFGs) can be converted into normal-form games (NFGs),
doing so comes at the cost of an exponential blowup of the strategy space. So, progress on …
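As a reference point for this entry, here is the standard multiplicative-weights (Hedge) dynamic on a normal-form matrix game, i.e., the NFG update that a kernelized approach would need to simulate implicitly over an EFG's exponentially large strategy space. The payoff matrix, step size, and horizon below are illustrative assumptions.

```python
import numpy as np

def softmax(v):
    v = v - v.max()            # numerical stability
    e = np.exp(v)
    return e / e.sum()

rng = np.random.default_rng(0)
A = rng.uniform(-1, 1, size=(4, 5))   # row player maximizes x^T A y

eta = 0.1
score_row = np.zeros(A.shape[0])      # cumulative payoff of each row action
score_col = np.zeros(A.shape[1])      # cumulative payoff of each column action
avg_row = np.zeros(A.shape[0])
avg_col = np.zeros(A.shape[1])

T = 5000
for _ in range(T):
    x = softmax(eta * score_row)      # weights proportional to exp(eta * score)
    y = softmax(eta * score_col)
    avg_row += x
    avg_col += y
    score_row += A @ y                # payoff of each row action vs. y
    score_col += -(A.T @ x)           # zero-sum: column player gets -x^T A y

print("approximate equilibrium:", avg_row / T, avg_col / T)
```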
Faster game solving via predictive Blackwell approachability: Connecting regret matching and mirror descent
Blackwell approachability is a framework for reasoning about repeated games with vector-
valued payoffs. We introduce predictive Blackwell approachability, where an estimate of the …
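To make the regret-matching side of this connection concrete, the sketch below runs regret matching+ with a simple prediction term (the most recent instantaneous regret), in the spirit of the paper's predictive variant. The exact predictive update, averaging scheme, and game are assumptions here and may differ from the paper.

```python
import numpy as np

def normalize_positive(v):
    """Threshold at zero and normalize onto the simplex."""
    p = np.maximum(v, 0.0)
    s = p.sum()
    return p / s if s > 0 else np.full_like(v, 1.0 / len(v))

# RM+ with a prediction term, in self-play on rock-paper-scissors (illustrative).
A = np.array([[0.0, -1.0, 1.0],
              [1.0, 0.0, -1.0],
              [-1.0, 1.0, 0.0]])

z_row = np.zeros(3); m_row = np.zeros(3)
z_col = np.zeros(3); m_col = np.zeros(3)
avg_row = np.zeros(3); avg_col = np.zeros(3)

T = 10_000
for t in range(1, T + 1):
    x = normalize_positive(z_row + m_row)   # optimistic step with predicted regret
    y = normalize_positive(z_col + m_col)
    u_row = A @ y
    u_col = -(A.T @ x)
    r_row = u_row - x @ u_row               # instantaneous regret vectors
    r_col = u_col - y @ u_col
    z_row = np.maximum(z_row + r_row, 0.0)  # RM+ thresholding of cumulative regret
    z_col = np.maximum(z_col + r_col, 0.0)
    m_row, m_col = r_row, r_col             # predict next regret = last regret
    avg_row += t * x                        # linear averaging, as commonly used with RM+
    avg_col += t * y

norm = T * (T + 1) / 2
print("average strategies:", avg_row / norm, avg_col / norm)
```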
Near-optimal learning of extensive-form games with imperfect information
This paper resolves the open question of designing near-optimal algorithms for learning
imperfect-information extensive-form games from bandit feedback. We present the first line …
Learning in two-player zero-sum partially observable Markov games with perfect recall
We study the problem of learning a Nash equilibrium (NE) in an extensive game with
imperfect information (EGII) through self-play. Precisely, we focus on two-player, zero-sum …
Block-coordinate methods and restarting for solving extensive-form games
Coordinate descent methods are popular in machine learning and optimization for their
simple sparse updates and excellent practical performance. In the context of large-scale …
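For readers unfamiliar with the template, the following is a generic cyclic block-coordinate method on a strongly convex quadratic, showing the "update one block exactly while freezing the rest" pattern that such game-solving methods adapt. It is not the paper's algorithm, and the problem data are made up.

```python
import numpy as np

# Minimize f(x) = 0.5 * x^T Q x - b^T x by exact minimization over one block
# of coordinates at a time, cycling through the blocks.
rng = np.random.default_rng(1)
n, k = 12, 3                        # 12 variables split into 3 blocks of 4
M = rng.standard_normal((n, n))
Q = M @ M.T + n * np.eye(n)         # symmetric positive definite
b = rng.standard_normal(n)
blocks = np.array_split(np.arange(n), k)

x = np.zeros(n)
for _ in range(50):                 # full sweeps over all blocks
    for idx in blocks:
        rest = np.setdiff1d(np.arange(n), idx)
        # Exact block minimization: Q[idx, idx] x[idx] = b[idx] - Q[idx, rest] x[rest]
        rhs = b[idx] - Q[np.ix_(idx, rest)] @ x[rest]
        x[idx] = np.linalg.solve(Q[np.ix_(idx, idx)], rhs)

print("residual norm:", np.linalg.norm(Q @ x - b))  # should be near zero
```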
Optimistic regret minimization for extensive-form games via dilated distance-generating functions
We study the performance of optimistic regret-minimization algorithms for both minimizing
regret in, and computing Nash equilibria of, zero-sum extensive-form games. In order to …
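The optimistic template itself is simple to state on a simplex: play against the cumulative loss plus a prediction of the next loss (here, the last observed gradient). The sketch below does this with the entropy regularizer (optimistic multiplicative weights) in self-play on a toy matrix game; dilated distance-generating functions are what lift this kind of update from the simplex to sequence-form strategy spaces. All numbers and names are illustrative assumptions.

```python
import numpy as np

def softmax(v):
    v = v - v.max()
    e = np.exp(v)
    return e / e.sum()

rng = np.random.default_rng(2)
A = rng.uniform(-1, 1, size=(4, 4))        # illustrative zero-sum payoff matrix

eta = 0.1
L_row = np.zeros(4); m_row = np.zeros(4)   # cumulative losses and predictions
L_col = np.zeros(4); m_col = np.zeros(4)
avg_row = np.zeros(4); avg_col = np.zeros(4)

T = 5000
for _ in range(T):
    x = softmax(-eta * (L_row + m_row))    # optimistic step with predicted loss
    y = softmax(-eta * (L_col + m_col))
    g_row = -(A @ y)                       # row player's loss gradient
    g_col = A.T @ x                        # column player's loss gradient
    L_row += g_row; m_row = g_row          # prediction = last gradient
    L_col += g_col; m_col = g_col
    avg_row += x; avg_col += y

print("approximate equilibrium:", avg_row / T, avg_col / T)
```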
Better regularization for sequential decision spaces: Fast convergence rates for Nash, correlated, and team equilibria
We study the application of iterative first-order methods to the problem of computing
equilibria of large-scale two-player extensive-form games. First-order methods must typically …
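To illustrate what a regularizer for a sequential decision space looks like, the snippet below evaluates a dilated (negative) entropy on a toy treeplex with one root information set and one child information set, scaling each local entropy by the probability of reaching its information set. The treeplex structure and uniform weights are illustrative assumptions, not the construction analyzed in the paper.

```python
import numpy as np

# Toy treeplex in sequence form: root sequences (a, b) with x_a + x_b = 1, and a
# child information set reached after a with sequences (c, d), x_c + x_d = x_a.

def neg_entropy(p):
    p = np.clip(p, 1e-12, None)
    return float(np.sum(p * np.log(p)))

def dilated_entropy(x):
    x_a, x_b, x_c, x_d = x
    local_root = neg_entropy(np.array([x_a, x_b]))                    # behavior at root
    local_child = neg_entropy(np.array([x_c, x_d]) / max(x_a, 1e-12)) # behavior at child
    # Each information set's local entropy is "dilated" by the reach probability
    # of its parent sequence (1 at the root, x_a at the child).
    return local_root + x_a * local_child

# A valid sequence-form strategy: play a w.p. 0.6, then c w.p. 0.5 given a.
x = np.array([0.6, 0.4, 0.3, 0.3])
print("dilated entropy value:", dilated_entropy(x))
```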