An overview of multi-agent reinforcement learning from game theoretical perspective
Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …
What are higher-order networks?
Network-based modeling of complex systems and data using the language of graphs has
become an essential topic across a range of different disciplines. Arguably, this graph-based …
The mechanics of n-player differentiable games
The cornerstone underpinning deep learning is the guarantee that gradient descent on an
objective converges to local minima. Unfortunately, this guarantee fails in settings, such as …
Open-ended learning in symmetric zero-sum games
Zero-sum games such as chess and poker are, abstractly, functions that evaluate pairs of
agents, for example labeling them 'winner' and 'loser'. If the game is approximately transitive …
Hodge Laplacians on graphs
LH Lim - SIAM Review, 2020 - SIAM
This is an elementary introduction to the Hodge Laplacian on a graph, a higher-order
generalization of the graph Laplacian. We will discuss basic properties including …
On last-iterate convergence beyond zero-sum games
Most existing results about last-iterate convergence of learning dynamics are limited to two-
player zero-sum games, and only apply under rigid assumptions about what dynamics the …
α-Rank: Multi-Agent Evaluation by Evolution
We introduce α-Rank, a principled evolutionary dynamics methodology, for the evaluation
and ranking of agents in large-scale multi-agent interactions, grounded in a novel dynamical …
Modelling behavioural diversity for learning in open-ended games
Promoting behavioural diversity is critical for solving games with non-transitive dynamics
where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors) …
Policy space diversity for non-transitive games
Policy-Space Response Oracles (PSRO) is an influential algorithm framework for
approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous …
Re-evaluating evaluation
Progress in machine learning is measured by careful evaluation on problems of outstanding
common interest. However, the proliferation of benchmark suites and environments …