A survey of inverse reinforcement learning: Challenges, methods and progress

S Arora, P Doshi - Artificial Intelligence, 2021 - Elsevier
Inverse reinforcement learning (IRL) is the problem of inferring the reward function of an
agent, given its policy or observed behavior. Analogous to RL, IRL is perceived both as a …

Decentralized control of partially observable Markov decision processes

C Amato, G Chowdhary, A Geramifard… - … IEEE Conference on …, 2013 - ieeexplore.ieee.org
Markov decision processes (MDPs) are often used to model sequential decision problems
involving uncertainty under the assumption of centralized control. However, many large …

[BOOK][B] A concise introduction to decentralized POMDPs

FA Oliehoek, C Amato - 2016 - Springer
This book presents an overview of formal decision making methods for decentralized
cooperative systems. It is aimed at graduate students and researchers in the fields of …

Reinforcement learning

MA Wiering, M Van Otterlo - Adaptation, learning, and optimization, 2012 - Springer
Reinforcement Learning: State-of-the-Art. Marco Wiering, Martijn van Otterlo (Eds.).
Adaptation, Learning, and Optimization, Volume 12 …

Optimal and approximate Q-value functions for decentralized POMDPs

FA Oliehoek, MTJ Spaan, N Vlassis - Journal of Artificial Intelligence …, 2008 - jair.org
Decision-theoretic planning is a popular approach to sequential decision making problems,
because it treats uncertainty in sensing and acting in a principled way. In single-agent …

Game theory and multi-agent reinforcement learning

A Nowé, P Vrancx, YM De Hauwere - Reinforcement learning: State-of-the …, 2012 - Springer
Reinforcement Learning was originally developed for Markov Decision Processes (MDPs). It
allows a single agent to learn a policy that maximizes a possibly delayed reward signal in a …

Credit assignment for collective multiagent RL with global rewards

DT Nguyen, A Kumar, HC Lau - Advances in neural …, 2018 - proceedings.neurips.cc
Scaling decision theoretic planning to large multiagent systems is challenging due to
uncertainty and partial observability in the environment. We focus on a multiagent planning …

Coordinated multi-robot exploration under communication constraints using decentralized Markov decision processes

L Matignon, L Jeanpierre, AI Mouaddib - Proceedings of the AAAI …, 2012 - ojs.aaai.org
Recent works on multi-agent sequential decision making using decentralized partially
observable Markov decision processes have been concerned with interaction-oriented …

Modeling and planning with macro-actions in decentralized POMDPs

C Amato, G Konidaris, LP Kaelbling, JP How - Journal of Artificial …, 2019 - jair.org
Decentralized partially observable Markov decision processes (Dec-POMDPs) are general
models for decentralized multi-agent decision making under uncertainty. However, they …

Online planning for multi-agent systems with bounded communication

F Wu, S Zilberstein, X Chen - Artificial Intelligence, 2011 - Elsevier
We propose an online algorithm for planning under uncertainty in multi-agent settings
modeled as DEC-POMDPs. The algorithm helps overcome the high computational …