XIRL: Cross-embodiment inverse reinforcement learning

K Zakka, A Zeng, P Florence… - … on Robot Learning, 2022 - proceedings.mlr.press
We investigate the visual cross-embodiment imitation setting, in which agents learn policies
from videos of other agents (such as humans) demonstrating the same task, but with stark …

Generalized hindsight for reinforcement learning

A Li, L Pinto, P Abbeel - Advances in neural information …, 2020 - proceedings.neurips.cc
One of the key reasons for the high sample complexity in reinforcement learning (RL) is the
inability to transfer knowledge from one task to another. In standard multi-task RL settings …

Cross-domain imitation learning via optimal transport

A Fickinger, S Cohen, S Russell, B Amos - arXiv preprint arXiv:2110.03684, 2021 - arxiv.org
Cross-domain imitation learning studies how to leverage expert demonstrations of one
agent to train an imitation agent with a different embodiment or morphology. Comparing …

Cross-domain policy adaptation via value-guided data filtering

K Xu, C Bai, X Ma, D Wang, B Zhao… - Advances in …, 2023 - proceedings.neurips.cc
Generalizing policies across different domains with dynamics mismatch poses a significant
challenge in reinforcement learning. For example, a robot learns the policy in a simulator …

A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents

H Niu, J Hu, G Zhou, X Zhan - arXiv preprint arXiv:2402.04580, 2024 - arxiv.org
The burgeoning fields of robot learning and embodied AI have triggered an increasing
demand for large quantities of data. However, collecting sufficient unbiased data from the …

Crossloco: Human motion driven control of legged robots via guided unsupervised reinforcement learning

T Li, H Jung, M Gombolay, YK Cho, S Ha - arXiv preprint arXiv:2309.17046, 2023 - arxiv.org
Human motion driven control (HMDC) is an effective approach for generating natural and
compelling robot motions while preserving high-level semantics. However, establishing the …

Energy-Aware Hierarchical Reinforcement Learning Based on the Predictive Energy Consumption Algorithm for Search and Rescue Aerial Robots in Unknown …

M Ramezani, MA Amiri Atashgah - Drones, 2024 - mdpi.com
Aerial robots (drones) offer critical advantages in missions where human participation is
impeded due to hazardous conditions. Among these, search and rescue missions in disaster …

Revolver: Continuous evolutionary models for robot-to-robot policy transfer

X Liu, D Pathak, KM Kitani - arXiv preprint arXiv:2202.05244, 2022 - arxiv.org
A popular paradigm in robotic learning is to train a policy from scratch for every new robot.
This is not only inefficient but also often impractical for complex robots. In this work, we …

CO-PILOT: Collaborative planning and reinforcement learning on sub-task curriculum

S Ao, T Zhou, G Long, Q Lu, L Zhu… - Advances in Neural …, 2021 - proceedings.neurips.cc
Goal-conditioned reinforcement learning (RL) usually suffers from sparse reward and
inefficient exploration in long-horizon tasks. Planning can find the shortest path to a distant …

Mirage: Cross-Embodiment Zero-Shot Policy Transfer with Cross-Painting

LY Chen, K Hari, K Dharmarajan, C Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
The ability to reuse collected data and transfer trained policies between robots could
alleviate the burden of additional data collection and training. While existing approaches …