A policy-guided imitation approach for offline reinforcement learning
Offline reinforcement learning (RL) methods can generally be categorized into two types:
RL-based and imitation-based. RL-based methods could in principle enjoy out-of-distribution …
Is model ensemble necessary? Model-based RL via a single model with Lipschitz regularized value function
Probabilistic dynamics model ensemble is widely used in existing model-based
reinforcement learning methods as it outperforms a single dynamics model in both …
Diffusion imitation from observation
Learning from observation (LfO) aims to imitate experts by learning from state-only
demonstrations without requiring action labels. Existing adversarial imitation learning …
Towards Generalist Robot Learning from Internet Video: A Survey
This survey presents an overview of methods for learning from video (LfV) in the context of
reinforcement learning (RL) and robotics. We focus on methods capable of scaling to large …
Imitation learning from observation with automatic discount scheduling
Humans often acquire new skills through observation and imitation. For robotic agents,
learning from the plethora of unlabeled video demonstration data available on the Internet …
DGTRL: Deep graph transfer reinforcement learning method based on fusion of knowledge and data
G Chen, J Qi, Y Gao, X Zhu, Z Dong, Y Sun - Information Sciences, 2024 - Elsevier
Deep reinforcement learning has shown promising application effects in many fields.
However, issues such as low sample efficiency and weak knowledge transfer and …
Learning rational subgoals from demonstrations and instructions
We present a framework for learning useful subgoals that support efficient long-term
planning to achieve novel goals. At the core of our framework is a collection of rational …
NOLO: Navigate Only Look Once
The in-context learning ability of Transformer models has brought new possibilities to visual
navigation. In this paper, we focus on the video navigation setting, where an in-context …
GAN-MPC: Training Model Predictive Controllers with Parameterized Cost Functions using Demonstrations from Non-identical Experts
Model predictive control (MPC) is a popular approach for trajectory optimization in practical
robotics applications. MPC policies can optimize trajectory parameters under kinodynamic …
[CITATION][C] A survey of imitation learning: tradition and new advances
张超, 白文松, 杜歆, 柳伟杰, 周晨浩, 钱徽 - Journal of Image and Graphics, 2023