A survey on offline reinforcement learning: Taxonomy, review, and open problems
RF Prudencio, MROA Maximo… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the widespread adoption of deep learning, reinforcement learning (RL) has
experienced a dramatic increase in popularity, scaling to previously intractable problems …
Reinforcement learning algorithms: A brief survey
Reinforcement Learning (RL) is a machine learning (ML) technique to learn sequential
decision-making in complex problems. RL is inspired by trial-and-error based human/animal …
Scaling up and distilling down: Language-guided robot skill acquisition
We present a framework for robot skill acquisition, which 1) efficiently scales up data
generation of language-labelled robot data and 2) effectively distills this data down into a …
A generalist agent
Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …
Planning with diffusion for flexible behavior synthesis
Model-based reinforcement learning methods often use learning only for the purpose of
estimating an approximate dynamics model, offloading the rest of the decision-making work …
Reinforced self-training (rest) for language modeling
Reinforcement learning from human feedback (RLHF) can improve the quality of large
language model's (LLM) outputs by aligning them with human preferences. We propose a …
Is conditional generative modeling all you need for decision-making?
Recent improvements in conditional generative modeling have made it possible to generate
high-quality images from language descriptions alone. We investigate whether these …
Dataset distillation by matching training trajectories
Dataset distillation is the task of synthesizing a small dataset such that a model trained on
the synthetic set will match the test accuracy of the model trained on the full dataset. In this …
Affordances from human videos as a versatile representation for robotics
Building a robot that can understand and learn to interact by watching humans has inspired
several vision problems. However, despite some successful results on static datasets, it …
Offline reinforcement learning with implicit q-learning
Offline reinforcement learning requires reconciling two conflicting aims: learning a policy that
improves over the behavior policy that collected the dataset, while at the same time …