PEAC: Unsupervised pre-training for cross-embodiment reinforcement learning

C Ying, Z Hao, X Zhou, X Xu, H Su, X Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Designing generalizable agents capable of adapting to diverse embodiments has achieved
significant attention in Reinforcement Learning (RL), which is critical for deploying RL …

Dreaming of many worlds: Learning contextual world models aids zero-shot generalization

S Prasanna, K Farid, R Rajan… - arXiv preprint arXiv …, 2024 - arxiv.org
Zero-shot generalization (ZSG) to unseen dynamics is a major challenge for creating
generally capable embodied agents. To address the broader challenge, we start with the …

Efficient offline meta-reinforcement learning via robust task representations and adaptive policy generation

Z Li, Z Lin, Y Chen, Z Liu - Proceedings of the Thirty-Third International …, 2024 - ijcai.org
Zero-shot adaptation is crucial for agents facing new tasks. Offline Meta-Reinforcement
Learning (OMRL), utilizing offline multi-task datasets to train policies, offers a way to attain …

On task-relevant loss functions in meta-reinforcement learning

J Shin, G Kim, H Lee, J Han… - 6th Annual Learning for …, 2024 - proceedings.mlr.press
Designing a competent meta-reinforcement learning (meta-RL) algorithm in terms of data
usage remains a central challenge to be tackled for its successful real-world applications. In …

Hierarchical Transformers are Efficient Meta-Reinforcement Learners

G Shala, A Biedenkapp, J Grabocka - arXiv preprint arXiv:2402.06402, 2024 - arxiv.org
We introduce Hierarchical Transformers for Meta-Reinforcement Learning (HTrMRL), a
powerful online meta-reinforcement learning approach. HTrMRL aims to address the …

GRAM: Generalization in Deep RL with a Robust Adaptation Module

J Queeney, X Cai, M Benosman, JP How - arXiv preprint arXiv:2412.04323, 2024 - arxiv.org
The reliable deployment of deep reinforcement learning in real-world settings requires the
ability to generalize across a variety of conditions, including both in-distribution scenarios …

Deep deterministic policy gradients with a self-adaptive reward mechanism for image retrieval

F Ahmad, X Zhang, Z Tang, F Sabah, M Azam… - The Journal of …, 2025 - Springer
Traditional image retrieval methods often face challenges in adapting to varying user
preferences and dynamic datasets. To address these limitations, this research introduces a …

Dynamics Generalisation with Behaviour Foundation Models

S Jeen, J Cullen - Workshop on Training Agents with Foundation …, 2024 - openreview.net
Reinforcement learning agents perform poorly when faced with unseen dynamics. Recent
work on Behaviour Foundation Models (BFMs) has produced agents capable of solving …

Inferring Behavior-Specific Context Improves Zero-Shot Generalization in Reinforcement Learning

TC Ndir, A Biedenkapp, N Awad - arXiv preprint arXiv:2404.09521, 2024 - arxiv.org
In this work, we address the challenge of zero-shot generalization (ZSG) in Reinforcement
Learning (RL), where agents must adapt to entirely novel environments without additional …

Reinforcing automated machine learning: bridging AutoML and reinforcement learning

T Eimer - 2024 - repo.uni-hannover.de
Reinforcement learning is a machine learning paradigm that allows learning through
interaction. It intertwines data collection and model training into a single problem statement …