A comprehensive survey on safe reinforcement learning
Abstract Safe Reinforcement Learning can be defined as the process of learning policies
that maximize the expectation of the return in problems in which it is important to ensure …
A survey on transfer learning for multiagent reinforcement learning systems
Multiagent Reinforcement Learning (RL) solves complex tasks that require coordination with
other agents through autonomous exploration of the environment. However, learning a …
The importance of pessimism in fixed-dataset policy optimization
We study worst-case guarantees on the expected return of fixed-dataset policy optimization
algorithms. Our core contribution is a unified conceptual and mathematical framework for the …
Agent-agnostic human-in-the-loop reinforcement learning
Providing Reinforcement Learning agents with expert advice can dramatically improve
various aspects of learning. Prior work has developed teaching protocols that enable agents …
A view of margin losses as regularizers of probability estimates
Regularization is commonly used in classifier design, to assure good generalization.
Classical regularization enforces a cost on classifier complexity, by constraining parameters …
Reinforcement learning under algorithmic triage
Methods to learn under algorithmic triage have predominantly focused on supervised
learning settings where each decision, or prediction, is independent of each other. Under …
Relational reinforcement learning for planning with exogenous effects
Probabilistic planners have improved recently to the point that they can solve difficult tasks
with complex and expressive models. In contrast, learners cannot tackle yet the expressive …
Learning to switch among agents in a team
Reinforcement learning agents have been mostly developed and evaluated under the
assumption that they will operate in a fully autonomous manner—they will take all actions. In …
Scheduled policy optimization for natural language communication with intelligent agents
We investigate the task of learning to follow natural language instructions by jointly
reasoning with visual observations and language inputs. In contrast to existing methods …
Transfer learning for multiagent reinforcement learning systems [J]
Learning to solve sequential decision-making tasks is difficult. Humans take years exploring
the environment essentially in a random way until they are able to reason, solve difficult …