Offline reinforcement learning: Tutorial, review, and perspectives on open problems
In this tutorial article, we aim to provide the reader with the conceptual tools needed to get
started on research on offline reinforcement learning algorithms: reinforcement learning …
A review of robot learning for manipulation: Challenges, representations, and algorithms
A key challenge in intelligent robotics is creating robots that are capable of directly
interacting with the world around them to achieve their goals. The last decade has seen …
RT-1: Robotics Transformer for real-world control at scale
A Brohan, N Brown, J Carbajal, Y Chebotar… - arxiv preprint arxiv …, 2022 - arxiv.org
By transferring knowledge from large, diverse, task-agnostic datasets, modern machine
learning models can solve specific downstream tasks either zero-shot or with small task …
The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care
Sepsis is the third leading cause of death worldwide and the main cause of mortality in
hospitals, but the best treatment strategy remains uncertain. In particular, evidence …
Coindice: Off-policy confidence interval estimation
We study high-confidence behavior-agnostic off-policy evaluation in reinforcement learning,
where the goal is to estimate a confidence interval on a target policy's value, given only …
Verifying learning-augmented systems
T Eliyahu, Y Kazak, G Katz, M Schapira - Proceedings of the 2021 ACM …, 2021 - dl.acm.org
The application of deep reinforcement learning (DRL) to computer and networked systems
has recently gained significant popularity. However, the obscurity of decisions by DRL …
Universal off-policy evaluation
When faced with sequential decision-making problems, it is often useful to be able to predict
what would happen if decisions were made using a new policy. Those predictions must …
Learning when-to-treat policies
Many applied decision-making problems have a dynamic component: The policymaker
needs not only to choose whom to treat, but also when to start which treatment. For example …
Off-policy policy evaluation for sequential decisions under unobserved confounding
H Namkoong, R Keramati… - Advances in Neural …, 2020 - proceedings.neurips.cc
When observed decisions depend only on observed features, off-policy policy evaluation
(OPE) methods for sequential decision problems can estimate the performance of evaluation …
An instrumental variable approach to confounded off-policy evaluation
Off-policy evaluation (OPE) aims to estimate the return of a target policy using some pre-
collected observational data generated by a potentially different behavior policy. In many …
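Several of the snippets above define off-policy evaluation the same way: estimate a target policy's return from data logged by a different behavior policy. A minimal sketch of the ordinary (per-trajectory) importance-sampling estimator, the simplest member of this family, might look like the following; the `pi_e` and `pi_b` callables and the trajectory layout are illustrative assumptions, not from any of the papers listed.

```python
import numpy as np

def ope_importance_sampling(trajectories, pi_e, pi_b, gamma=0.99):
    """Ordinary importance-sampling OPE estimate of the target policy's
    expected discounted return, using only behavior-policy data.

    trajectories : list of trajectories, each a list of (state, action, reward)
    pi_e(a, s)   : target (evaluation) policy's probability of action a in state s
    pi_b(a, s)   : behavior policy's probability of action a in state s
    """
    estimates = []
    for traj in trajectories:
        weight, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            # Accumulate the likelihood ratio between the two policies.
            weight *= pi_e(a, s) / pi_b(a, s)
            ret += (gamma ** t) * r
        # Each trajectory contributes its reweighted return.
        estimates.append(weight * ret)
    return float(np.mean(estimates))
```

With single-step data from a uniform behavior policy (`pi_b = 0.5` per action) and a target policy that always picks action 1, the estimator recovers the target policy's value exactly in expectation; in practice the products of likelihood ratios make the estimate high-variance over long horizons, which is precisely what the confidence-interval and confounding-robust methods in the entries above address.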