Efficient sim-to-real transfer of contact-rich manipulation skills with online admittance residual learning

X Zhang, C Wang, L Sun, Z Wu… - … on Robot Learning, 2023 - proceedings.mlr.press
Learning contact-rich manipulation skills is essential. Such skills require robots to
interact with the environment along feasible manipulation trajectories and with suitable compliance …
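
The mechanism named in the title is an admittance controller with a learned online residual. As a rough sketch only (not the authors' implementation; the `residual` interface below is hypothetical), a discrete-time 1-DoF admittance law turns a sensed contact force into a compliant position offset that a learned correction can adjust:

```python
import numpy as np

def admittance_step(x, xd, f_ext, M=1.0, D=20.0, K=100.0, dt=0.002):
    """One Euler step of a 1-DoF admittance law  M*xdd + D*xd + K*x = f_ext,
    mapping sensed force to a compliant position offset."""
    xdd = (f_ext - D * xd - K * x) / M
    xd += xdd * dt
    x += xd * dt
    return x, xd

def residual(x, xd, f_ext):
    """Placeholder for the learned correction to the commanded offset; a
    trained policy would output this online (hypothetical interface)."""
    return 0.0

x, xd = 0.0, 0.0
for _ in range(2000):                  # 4 s of simulated contact at 500 Hz
    x, xd = admittance_step(x, xd, f_ext=5.0)
    x_cmd = x + residual(x, xd, 5.0)   # command sent to the position controller
print(f"steady-state offset {x_cmd:.4f} m (analytic F/K = 0.05)")
```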

Hierarchical planning through goal-conditioned offline reinforcement learning

J Li, C Tang, M Tomizuka… - IEEE Robotics and …, 2022 - ieeexplore.ieee.org
Offline reinforcement learning (RL) has shown potential in many safety-critical robotics tasks
where exploration is risky and expensive. However, it still struggles to acquire skills in …
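
For orientation, the hierarchy named in the title typically pairs a high-level policy that proposes subgoals with a goal-conditioned low-level policy. The stub below shows only that control flow, with untrained placeholder networks, and is not the paper's algorithm:

```python
import torch, torch.nn as nn

S_DIM, G_DIM, A_DIM, K = 4, 2, 2, 10   # K: low-level steps per subgoal

# Hypothetical networks: a subgoal-proposing high level and a goal-conditioned
# low level; both would be trained offline, per the paper's setting.
high = nn.Sequential(nn.Linear(S_DIM, 64), nn.ReLU(), nn.Linear(64, G_DIM))
low = nn.Sequential(nn.Linear(S_DIM + G_DIM, 64), nn.ReLU(), nn.Linear(64, A_DIM))

def env_step(s, a):
    """Placeholder dynamics: a random walk standing in for the environment."""
    return s + 0.1 * torch.randn_like(s)

s = torch.randn(1, S_DIM)
for t in range(30):
    if t % K == 0:                     # replan a subgoal every K steps
        g = high(s)
    a = low(torch.cat([s, g], -1))     # low level chases the current subgoal
    s = env_step(s, a)
print(s)
```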

Design from policies: Conservative test-time adaptation for offline policy optimization

J Liu, H Zhang, Z Zhuang, Y Kang… - Advances in Neural …, 2024 - proceedings.neurips.cc
In this work, we decouple the iterative bi-level offline RL (value estimation and policy
extraction) from the offline training phase, forming a non-iterative bi-level paradigm and …
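
A minimal sketch of a non-iterative bi-level pipeline in this spirit, assuming a generic one-step setup (a SARSA-style value fit, then advantage-weighted action extraction at test time) rather than the paper's exact conservative adaptation scheme:

```python
import torch, torch.nn as nn

# Hypothetical offline dataset of transitions (s, a, r, s', a').
S_DIM, A_DIM, N = 4, 2, 1024
s, a = torch.randn(N, S_DIM), torch.randn(N, A_DIM)
r, s2, a2 = torch.randn(N, 1), torch.randn(N, S_DIM), torch.randn(N, A_DIM)

q = nn.Sequential(nn.Linear(S_DIM + A_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(q.parameters(), lr=1e-3)

# Stage 1 (offline): fit Q of the behavior policy by SARSA regression --
# value estimation runs once, with no policy-improvement loop interleaved.
for _ in range(200):
    with torch.no_grad():
        target = r + 0.99 * q(torch.cat([s2, a2], -1))
    loss = ((q(torch.cat([s, a], -1)) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2 (test time): extract an action at a query state by re-weighting
# dataset actions with exp(advantage) -- a stand-in for the paper's
# conservative test-time adaptation, which this sketch does not reproduce.
@torch.no_grad()
def act(query_s, beta=3.0):
    qs = q(torch.cat([query_s.expand(N, -1), a], -1)).squeeze(-1)
    w = torch.softmax(beta * (qs - qs.mean()), dim=0)
    return (w.unsqueeze(-1) * a).sum(0)

print(act(torch.randn(1, S_DIM)))
```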

When to trust your simulator: Dynamics-aware hybrid offline-and-online reinforcement learning

H Niu, Y Qiu, M Li, G Zhou, J Hu… - Advances in Neural …, 2022 - proceedings.neurips.cc
Learning effective reinforcement learning (RL) policies to solve real-world complex tasks
can be quite challenging without a high-fidelity simulation environment. In most cases, we …
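
One common way to make a hybrid update dynamics-aware is a density-ratio classifier over transitions (the DARC-style trick); this is an assumption about the general technique, not necessarily this paper's exact estimator:

```python
import torch, torch.nn as nn

# Toy stand-in data: "real" and "sim" transitions (s, a, s'), with a dynamics
# mismatch injected into the sim next-states. Shapes and names illustrative.
N, D = 2048, 6
real = torch.randn(N, D)
sim = torch.randn(N, D) + torch.tensor([0., 0., 0., 0.5, 0.5, 0.5])

# A classifier over (s, a, s'); its logit estimates log p_real/p_sim up to a
# constant, which serves as a per-sample dynamics-gap measure.
clf = nn.Sequential(nn.Linear(D, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(clf.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()
for _ in range(300):
    logits = clf(torch.cat([real, sim]))
    labels = torch.cat([torch.ones(N, 1), torch.zeros(N, 1)])
    loss = bce(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()

# Trust weight per sim transition: the classifier's probability that it could
# have come from the real system. A low weight flags a large dynamics gap, so
# that sim sample would be downweighted (or its Q penalized) in the update.
w = torch.sigmoid(clf(sim)).detach()
print("mean trust weight on sim:", w.mean().item())
```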

Diffusion policies for out-of-distribution generalization in offline reinforcement learning

SE Ada, E Oztop, E Ugur - IEEE Robotics and Automation …, 2024 - ieeexplore.ieee.org
Offline Reinforcement Learning (RL) methods leverage previous experiences to learn better
policies than the behavior policy used for data collection. However, they face challenges …
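
For reference, a diffusion policy samples actions by running the standard DDPM reverse process conditioned on the state. The noise network below is an untrained stub, so this shows only the sampling loop's shape, not a trained policy:

```python
import torch, torch.nn as nn

T = 50
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
abar = torch.cumprod(alphas, dim=0)

# Hypothetical noise-prediction network conditioned on state and step index.
A_DIM, S_DIM = 2, 4
eps_net = nn.Sequential(nn.Linear(A_DIM + S_DIM + 1, 64), nn.ReLU(),
                        nn.Linear(64, A_DIM))

@torch.no_grad()
def sample_action(state):
    """Standard DDPM ancestral sampling over the action space."""
    a = torch.randn(1, A_DIM)                      # start from pure noise
    for t in reversed(range(T)):
        t_in = torch.full((1, 1), t / T)
        eps = eps_net(torch.cat([a, state, t_in], -1))
        # Posterior mean: (a - beta_t / sqrt(1 - abar_t) * eps) / sqrt(alpha_t)
        a = (a - betas[t] / (1 - abar[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            a = a + betas[t].sqrt() * torch.randn_like(a)
    return a

print(sample_action(torch.randn(1, S_DIM)))
```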

Guided online distillation: Promoting safe reinforcement learning by offline demonstration

J Li, X Liu, B Zhu, J Jiao, M Tomizuka… - … on Robotics and …, 2024 - ieeexplore.ieee.org
Safe Reinforcement Learning (RL) aims to find a policy that achieves high rewards while
satisfying cost constraints. When learning from scratch, safe RL agents tend to be overly …
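
A plausible reading of "guided online distillation" is an online agent regularized toward a guide policy pretrained offline on demonstrations, with the guidance annealed away as the student matures. The sketch assumes that generic scheme (discrete actions, placeholder RL loss), not the paper's exact method:

```python
import torch, torch.nn as nn, torch.nn.functional as F

S_DIM, A_DIM = 4, 3   # discrete actions for a compact example

# Guide: a policy pretrained offline on expert demonstrations (stub here).
guide = nn.Sequential(nn.Linear(S_DIM, 64), nn.ReLU(), nn.Linear(64, A_DIM))
student = nn.Sequential(nn.Linear(S_DIM, 64), nn.ReLU(), nn.Linear(64, A_DIM))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def rl_loss(logits):
    """Placeholder for the online (safe) RL objective, e.g. a policy-gradient
    surrogate with cost penalties; random advantages stand in for rollouts."""
    adv = torch.randn(logits.shape[0], 1)
    idx = torch.randint(A_DIM, (logits.shape[0], 1))
    return -(adv * F.log_softmax(logits, -1).gather(1, idx)).mean()

for step in range(200):
    s = torch.randn(32, S_DIM)
    w = max(0.0, 1.0 - step / 150)     # anneal guidance over training
    kl = F.kl_div(F.log_softmax(student(s), -1),
                  F.softmax(guide(s), -1).detach(), reduction="batchmean")
    loss = rl_loss(student(s)) + w * kl
    opt.zero_grad(); loss.backward(); opt.step()
```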

Residual Q-learning: Offline and online policy customization without value estimation

C Li, C Tang, H Nishimura, J Mercat… - Advances in …, 2023 - proceedings.neurips.cc
Imitation Learning (IL) is a widely used framework for learning imitative behavior from
demonstrations. It is especially appealing for solving complex real-world tasks where …
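
As a loose stand-in for the customization setting, the sketch below uses generic residual policy learning (a learned corrective action on top of a frozen prior, trained only on an add-on objective). The paper's residual Q-learning update itself differs and notably avoids re-estimating the prior's value function:

```python
import numpy as np

def prior_policy(s):
    """Frozen prior (e.g., an imitation policy): a fixed linear map here."""
    return -0.5 * s

class ResidualPolicy:
    """Executed action is prior(s) + delta(s); only delta is trained."""
    def __init__(self, dim):
        self.W = np.zeros((dim, dim))
    def act(self, s):
        return prior_policy(s) + self.W @ s

# Toy add-on reward: prefer actions close to a customization target.
target = np.array([0.2, 0.2])
def add_on_reward(a):
    return -np.sum((a - target) ** 2)

# Crude random-search update of the residual on the add-on reward only.
pi = ResidualPolicy(dim=2)
rng = np.random.default_rng(0)
for _ in range(500):
    noise = 0.05 * rng.standard_normal(pi.W.shape)
    s = rng.standard_normal(2)
    if (add_on_reward(prior_policy(s) + (pi.W + noise) @ s)
            > add_on_reward(pi.act(s))):
        pi.W += noise
print(pi.act(np.array([1.0, -1.0])))
```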

Adaptive prediction ensemble: Improving out-of-distribution generalization of motion forecasting

J Li, J Li, S Bae, D Isele - IEEE Robotics and Automation …, 2024 - ieeexplore.ieee.org
Deep learning-based trajectory prediction models for autonomous driving often struggle with
generalization to out-of-distribution (OOD) scenarios, sometimes performing worse than …
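
A simple instance of an adaptive ensemble: gate between a learned forecaster and a rule-based constant-velocity fallback, using ensemble variance as an out-of-distribution proxy. All models below are illustrative stubs, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def constant_velocity(history, horizon=10):
    """Rule-based fallback: extrapolate the last observed velocity."""
    v = history[-1] - history[-2]
    return history[-1] + v * np.arange(1, horizon + 1)[:, None]

def learned_ensemble(history, horizon=10, members=5):
    """Stand-in for an ensemble of deep forecasters: noisy extrapolations.
    A real model would be trained on driving logs."""
    base = constant_velocity(history, horizon)
    return np.stack([base + 0.1 * rng.standard_normal(base.shape)
                     for _ in range(members)])

def adaptive_predict(history, var_threshold=0.05):
    """Use the learned mean when members agree; fall back when they diverge,
    treating high ensemble variance as a sign of out-of-distribution input."""
    preds = learned_ensemble(history)
    if preds.var(axis=0).mean() > var_threshold:
        return constant_velocity(history)
    return preds.mean(axis=0)

history = np.cumsum(0.5 * np.ones((5, 2)), axis=0)   # straight-line track
print(adaptive_predict(history)[:3])
```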

ODICE: Revealing the mystery of distribution correction estimation via orthogonal-gradient update

L Mao, H Xu, W Zhang, X Zhan - arXiv preprint arXiv:2402.00348, 2024 - arxiv.org
In this study, we investigate the DIstribution Correction Estimation (DICE) methods, an
important line of work in offline reinforcement learning (RL) and imitation learning (IL). DICE …
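
The orthogonal-gradient update itself is easy to state: take the gradients of two loss terms separately, project the second onto the subspace orthogonal to the first, and step along their sum. The sketch uses toy quadratic losses in place of the DICE objective's actual forward and backward terms:

```python
import torch

theta = torch.randn(8, requires_grad=True)
opt = torch.optim.SGD([theta], lr=0.1)
target_f, target_b = torch.randn(8), torch.randn(8)

for _ in range(100):
    loss_f = ((theta - target_f) ** 2).sum()   # "forward" term (toy)
    loss_b = ((theta - target_b) ** 2).sum()   # "backward" term (toy)
    g_f, = torch.autograd.grad(loss_f, theta)
    g_b, = torch.autograd.grad(loss_b, theta)
    # Orthogonal-gradient update: keep only the component of g_b orthogonal
    # to g_f, so the backward term cannot cancel the forward one.
    g_b_perp = g_b - (g_b @ g_f) / (g_f @ g_f + 1e-8) * g_f
    opt.zero_grad()
    theta.grad = g_f + g_b_perp
    opt.step()
print(theta)
```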

DOMAIN: Mildly conservative model-based offline reinforcement learning

XY Liu, XH Zhou, MJ Gui, XL Xie, SQ Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Model-based reinforcement learning (RL), which learns an environment model from an offline
dataset and generates additional out-of-distribution model data, has become an effective …
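
A common conservative model-based recipe, shown here as a stand-in for the paper's "mildly conservative" scheme: roll out an ensemble of learned dynamics models and penalize the reward by ensemble disagreement (MOPO-style). The ensemble members are untrained stubs and the reward is a placeholder:

```python
import torch, torch.nn as nn

S_DIM, A_DIM, N_MODELS = 3, 1, 4

# Hypothetical ensemble of one-step dynamics models that would be trained on
# the offline dataset (training loop omitted; untrained stubs here).
models = [nn.Sequential(nn.Linear(S_DIM + A_DIM, 64), nn.ReLU(),
                        nn.Linear(64, S_DIM)) for _ in range(N_MODELS)]

def reward_fn(s, a):
    """Placeholder task reward."""
    return -(s ** 2).sum(-1, keepdim=True)

def model_rollout_step(s, a, lam=1.0):
    """Generate a synthetic transition and apply a conservatism penalty:
    reward is reduced where the ensemble members disagree, a proxy for
    epistemic uncertainty on out-of-distribution model data."""
    x = torch.cat([s, a], -1)
    preds = torch.stack([m(x) for m in models])         # (N_MODELS, B, S_DIM)
    s_next = preds.mean(0)
    disagreement = preds.std(0).mean(-1, keepdim=True)
    reward = reward_fn(s, a) - lam * disagreement
    return s_next, reward

s, a = torch.randn(32, S_DIM), torch.randn(32, A_DIM)
print(model_rollout_step(s, a)[1].mean())
```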