Reward-agnostic fine-tuning: Provable statistical benefits of hybrid reinforcement learning
This paper studies tabular reinforcement learning (RL) in the hybrid setting, which assumes
access to both an offline dataset and online interactions with the unknown environment. A …
Provably safe reinforcement learning with step-wise violation constraints
We investigate a novel safe reinforcement learning problem with step-wise violation
constraints. Our problem differs from existing works in that we focus on stricter step-wise …
Minimax-optimal reward-agnostic exploration in reinforcement learning
This paper studies reward-agnostic exploration in reinforcement learning (RL)—a scenario
where the learner is unaware of the reward functions during the exploration stage—and …
Provable safe reinforcement learning with binary feedback
Safety is a crucial necessity in many applications of reinforcement learning (RL), whether
robotic, automotive, or medical. Many existing approaches to safe RL rely on receiving …
Near-optimal conservative exploration in reinforcement learning under episode-wise constraints
This paper investigates conservative exploration in reinforcement learning where the
performance of the learning agent is guaranteed to be above a certain threshold throughout …
AED: Adaptable Error Detection for Few-shot Imitation Policy
We study how to report few-shot imitation (FSI) policies' behavior errors in novel
environments, a novel task named adaptable error detection (AED). The potential to cause …