A survey of meta-reinforcement learning
While deep reinforcement learning (RL) has fueled multiple high-profile successes in
machine learning, it is held back from more widespread adoption by its often poor data …
Discovered policy optimisation
Tremendous progress has been made in reinforcement learning (RL) over the past decade.
Most of these advancements came through the continual development of new algorithms …
Meta-reward-net: Implicitly differentiable reward learning for preference-based reinforcement learning
Setting up a well-designed reward function has been challenging for many
reinforcement learning applications. Preference-based reinforcement learning (PbRL) …
Student of Games: A unified learning algorithm for both perfect and imperfect information games
Games have a long history as benchmarks for progress in artificial intelligence. Approaches
using search and learning produced strong performance across many perfect information …
Offline pre-trained multi-agent decision transformer
Offline reinforcement learning leverages previously collected offline datasets to learn
optimal policies with no necessity to access the real environment. Such a paradigm is also …
Policy space diversity for non-transitive games
Policy-Space Response Oracles (PSRO) is an influential algorithm framework for
approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous …
MATE: Benchmarking multi-agent reinforcement learning in distributed target coverage control
We introduce the Multi-Agent Tracking Environment (MATE), a novel multi-agent
environment that simulates the target coverage control problems in the real world. MATE hosts …
A survey of decision making in adversarial games
In many practical applications, such as poker, chess, drug interdiction, cybersecurity, and
national defense, players often have adversarial stances, i.e., the selfish actions of each …
Team-PSRO for learning approximate TMECor in large team games via cooperative reinforcement learning
Recent algorithms have achieved superhuman performance at a number of two-player zero-sum
games such as poker and Go. However, many real-world situations are multi-player …
Turbocharging solution concepts: Solving NEs, CEs and CCEs with neural equilibrium solvers
Solution concepts such as Nash Equilibria, Correlated Equilibria, and Coarse Correlated
Equilibria are useful components for many multiagent machine learning algorithms …