A survey of meta-reinforcement learning

J Beck, R Vuorio, EZ Liu, Z Xiong, L Zintgraf… - arXiv preprint arXiv …, 2023 - arxiv.org
While deep reinforcement learning (RL) has fueled multiple high-profile successes in
machine learning, it is held back from more widespread adoption by its often poor data …

Discovered policy optimisation

C Lu, J Kuba, A Letcher, L Metz… - Advances in …, 2022 - proceedings.neurips.cc
Tremendous progress has been made in reinforcement learning (RL) over the past decade.
Most of these advancements came through the continual development of new algorithms …

Meta-reward-net: Implicitly differentiable reward learning for preference-based reinforcement learning

R Liu, F Bai, Y Du, Y Yang - Advances in Neural …, 2022 - proceedings.neurips.cc
Setting up a well-designed reward function has been challenging for many
reinforcement learning applications. Preference-based reinforcement learning (PbRL) …

Student of Games: A unified learning algorithm for both perfect and imperfect information games

M Schmid, M Moravčík, N Burch, R Kadlec… - Science …, 2023 - science.org
Games have a long history as benchmarks for progress in artificial intelligence. Approaches
using search and learning produced strong performance across many perfect information …

Offline pre-trained multi-agent decision transformer

L Meng, M Wen, C Le, X Li, D Xing, W Zhang… - Machine Intelligence …, 2023 - Springer
Offline reinforcement learning leverages previously collected offline datasets to learn
optimal policies with no necessity to access the real environment. Such a paradigm is also …

Policy space diversity for non-transitive games

J Yao, W Liu, H Fu, Y Yang… - Advances in Neural …, 2024 - proceedings.neurips.cc
Policy-Space Response Oracles (PSRO) is an influential algorithm framework for
approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous …

MATE: Benchmarking multi-agent reinforcement learning in distributed target coverage control

X Pan, M Liu, F Zhong, Y Yang… - Advances in Neural …, 2022 - proceedings.neurips.cc
We introduce the Multi-Agent Tracking Environment (MATE), a novel multi-agent
environment that simulates the target coverage control problems in the real world. MATE hosts …

A survey of decision making in adversarial games

X Li, M Meng, Y Hong, J Chen - Science China Information Sciences, 2024 - Springer
In many practical applications, such as poker, chess, drug interdiction, cybersecurity, and
national defense, players often have adversarial stances, i.e., the selfish actions of each …

Team-PSRO for learning approximate TMECor in large team games via cooperative reinforcement learning

S McAleer, G Farina, G Zhou, M Wang… - Advances in …, 2023 - proceedings.neurips.cc
Recent algorithms have achieved superhuman performance at a number of two-player
zero-sum games such as poker and Go. However, many real-world situations are multi-player …

Turbocharging solution concepts: Solving NEs, CEs and CCEs with neural equilibrium solvers

L Marris, I Gemp, T Anthony… - Advances in …, 2022 - proceedings.neurips.cc
Solution concepts such as Nash Equilibria, Correlated Equilibria, and Coarse Correlated
Equilibria are useful components for many multiagent machine learning algorithms …