A survey on multi-agent deep reinforcement learning: from the perspective of challenges and applications

W Du, S Ding - Artificial Intelligence Review, 2021 - Springer
Deep reinforcement learning has proved to be a fruitful method in various tasks in the field of
artificial intelligence during the last several years. Recent works have focused on deep …

A survey and critique of multiagent deep reinforcement learning

P Hernandez-Leal, B Kartal, ME Taylor - Autonomous Agents and Multi …, 2019 - Springer
Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

A survey of learning in multiagent environments: Dealing with non-stationarity

P Hernandez-Leal, M Kaisers, T Baarslag… - arXiv preprint arXiv …, 2017 - arxiv.org
The key challenge in multiagent learning is learning a best response to the behaviour of
other agents, which may be non-stationary: if the other agents adapt their strategy as well …

Autonomy and intelligence in the computing continuum: Challenges, enablers, and future directions for orchestration

H Kokkonen, L Lovén, NH Motlagh, A Kumar… - arXiv preprint arXiv …, 2022 - arxiv.org
Future AI applications require performance, reliability and privacy that the existing, cloud-
dependant system architectures cannot provide. In this article, we study orchestration in the …

Is multiagent deep reinforcement learning the answer or the question? A brief survey

P Hernandez-Leal, B Kartal, ME Taylor - learning, 2018 - researchgate.net
Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has
led to a dramatic increase in the number of applications and methods. Recent works have …

Coordinated Versus Decentralized Exploration In Multi-Agent Multi-Armed Bandits

M Chakraborty, KYP Chua, S Das, B Juba - IJCAI, 2017 - ijcai.org
In this paper, we introduce a multi-agent multiarmed bandit-based model for ad hoc
teamwork with expensive communication. The goal of the team is to maximize the total …

Towards efficient detection and optimal response against sophisticated opponents

T Yang, Z Meng, J Hao, C Zhang, Y Zheng… - arXiv preprint arXiv …, 2018 - arxiv.org
Multiagent algorithms often aim to accurately predict the behaviors of other agents and find a
best response accordingly. Previous works usually assume an opponent uses a stationary …

Learning Against Non-Stationary Agents with Opponent Modelling and Deep Reinforcement Learning

R Everett, SJ Roberts - AAAI Spring Symposia, 2018 - oxford-man.ox.ac.uk
Humans, like all animals, both cooperate and compete with each other. Through these
interactions we learn to observe, act, and manipulate to maximise our utility function, and …

Opponent Modeling with In-context Search

Y Jing, B Liu, K Li, Y Zang, H Fu, Q Fu… - Advances in …, 2025 - proceedings.neurips.cc
Opponent modeling is a longstanding research topic aimed at enhancing decision-making
by modeling information about opponents in multi-agent environments. However, existing …

Towards cooperation in sequential prisoner's dilemmas: a deep multiagent reinforcement learning approach

W Wang, J Hao, Y Wang, M Taylor - arXiv preprint arXiv:1803.00162, 2018 - arxiv.org
The Iterated Prisoner's Dilemma has guided research on social dilemmas for decades.
However, it distinguishes between only two atomic actions: cooperate and defect. In real …