A practical guide to multi-objective reinforcement learning and planning
Real-world sequential decision-making tasks are generally complex, requiring trade-offs
between multiple, often conflicting, objectives. Despite this, the majority of research in …
Finite-time frequentist regret bounds of multi-agent Thompson sampling on sparse hypergraphs
We study the multi-agent multi-armed bandit (MAMAB) problem, where agents are factored
into overlapping groups. Each group represents a hyperedge, forming a hypergraph over …
Statistical and computational trade-off in multi-agent multi-armed bandits
We study the problem of regret minimization in Multi-Agent Multi-Armed Bandits (MAMABs)
where the rewards are defined through a factor graph. We derive an instance-specific regret …
Context aware control systems: An engineering applications perspective
Cyber-physical systems revolve around context awareness, empowering objective-oriented
services, products and operations based on real data. Self-aware and self-control systems …
Multi-agent Thompson sampling for bandit applications with sparse neighbourhood structures
Multi-agent coordination is prevalent in many real-world applications. However, such
coordination is challenging due to its combinatorial nature. An important observation in this …
MOMAland: A Set of Benchmarks for Multi-Objective Multi-Agent Reinforcement Learning
Many challenging tasks such as managing traffic systems, electricity grids, or supply chains
involve complex decision-making processes that must balance multiple conflicting …
Best arm identification in multi-agent multi-armed bandits
We investigate the problem of best arm identification in Multi-Agent Multi-Armed Bandits
(MAMABs) where the rewards are defined through a factor graph. The objective is to find an …
Deep reinforcement learning for active wake control
G Neustroev, SPE Andringa, RA Verzijlbergh… - Proceedings of the 21st …, 2022 - ifaamas.org
Wind farms suffer from so-called wake effects: when turbines are located in the wind
shadows of other turbines, their power output is substantially reduced. These losses can be …
Budget allocation as a multi-agent system of contextual & continuous bandits
B Han, C Arndt - Proceedings of the 27th ACM SIGKDD Conference on …, 2021 - dl.acm.org
Budget allocation for online advertising suffers from multiple complications, including
significant delay between the initial ad impression to the call to action as well as cold-start …
AI-Toolbox: A C++ library for reinforcement learning and planning (with Python bindings)
This paper describes AI-Toolbox, a C++ software library that contains reinforcement learning
and planning algorithms, and supports both single and multi agent problems, as well as …