Decision transformer: Reinforcement learning via sequence modeling
We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence
modeling problem. This allows us to draw upon the simplicity and scalability of the …
What matters in learning from offline human demonstrations for robot manipulation
Imitating human demonstrations is a promising approach to endow robots with various
manipulation capabilities. While recent advances have been made in imitation learning and …
Behavior Transformers: Cloning modes with one stone
While behavior learning has made impressive progress in recent times, it lags behind
computer vision and natural language processing due to its inability to leverage large …
PlayFusion: Skill acquisition via diffusion from language-annotated play
Learning from unstructured and uncurated data has become the dominant paradigm for
generative approaches in language or vision. Such unstructured and unguided behavior …
Imitating human behaviour with diffusion models
Diffusion models have emerged as powerful generative models in the text-to-image domain.
This paper studies their application as observation-to-action models for imitating human …
Goal-conditioned reinforcement learning with imagined subgoals
Goal-conditioned reinforcement learning endows an agent with a large variety of skills, but it
often struggles to solve tasks that require more temporally extended reasoning. In this work …
Offline-to-online reinforcement learning via balanced replay and pessimistic Q-ensemble
Recent advances in deep offline reinforcement learning (RL) have made it possible to train
strong robotic agents from offline datasets. However, depending on the quality of the trained …
Goal-conditioned imitation learning using score-based diffusion policies
We propose a new policy representation based on score-based diffusion models (SDMs).
We apply our new policy representation in the domain of Goal-Conditioned Imitation …
HIQL: Offline goal-conditioned RL with latent states as actions
Unsupervised pre-training has recently become the bedrock for computer vision and natural
language processing. In reinforcement learning (RL), goal-conditioned RL can potentially …
State2Explanation: Concept-based explanations to benefit agent learning and user understanding
As more non-AI experts use complex AI systems for daily tasks, there has been an
increasing effort to develop methods that produce explanations of AI decision making that …