Guide your agent with adaptive multimodal rewards

C Kim, Y Seo, H Liu, L Lee, J Shin… - Advances in Neural …, 2024 - proceedings.neurips.cc
Developing an agent capable of adapting to unseen environments remains a difficult
challenge in imitation learning. This work presents Adaptive Return-conditioned Policy …

Mastering robot manipulation with multimodal prompts through pretraining and multi-task fine-tuning

J Li, Q Gao, M Johnston, X Gao, X He… - arXiv preprint arXiv …, 2023 - arxiv.org
Prompt-based learning has been demonstrated as a compelling paradigm contributing to
the tremendous success of large language models (LLMs). Inspired by their success in language …

Reward-Relevance-Filtered Linear Offline Reinforcement Learning

A Zhou - … Conference on Artificial Intelligence and Statistics, 2024 - proceedings.mlr.press
This paper studies offline reinforcement learning with linear function approximation in a
setting with decision-theoretic, but not estimation sparsity. The structural restrictions of the …

Masked and Inverse Dynamics Modeling for Data-Efficient Reinforcement Learning

YJ Lee, J Kim, YJ Park, M Kwak… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
In pixel-based deep reinforcement learning (DRL), learning representations of states that
change because of an agent's action or interaction with the environment poses a critical …

Learning Abstract World Model for Value-preserving Planning with Options

R Rodriguez-Sanchez, G Konidaris - arXiv preprint arXiv:2406.15850, 2024 - arxiv.org
General-purpose agents require fine-grained controls and rich sensory inputs to perform a
wide range of tasks. However, this complexity often leads to intractable decision-making …

Exact and soft successive refinement of the information bottleneck

H Charvin, N Catenacci Volpi, D Polani - Entropy, 2023 - mdpi.com
The information bottleneck (IB) framework formalises the essential requirement for efficient
information processing systems to achieve an optimal balance between the complexity of …

Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity

P Amortila, DJ Foster, N Jiang, A Krishnamurthy… - arXiv preprint arXiv …, 2024 - arxiv.org
Real-world applications of reinforcement learning often involve environments where agents
operate on complex, high-dimensional observations, but the underlying ("latent") dynamics …

Policy-shaped prediction: avoiding distractions in model-based reinforcement learning

M Hutson, I Kauvar, N Haber - arXiv preprint arXiv:2412.05766, 2024 - arxiv.org
Model-based reinforcement learning (MBRL) is a promising route to sample-efficient policy
optimization. However, a known vulnerability of reconstruction-based MBRL consists of …

Learning Fused State Representations for Control from Multi-View Observations

Z Wang, YH Li, X Li, H Zang, R Laroche… - arXiv preprint arXiv …, 2025 - arxiv.org
Multi-View Reinforcement Learning (MVRL) seeks to provide agents with multi-view
observations, enabling them to perceive the environment with greater effectiveness and …

Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments

A Dedieu, W Lehrach, G Zhou, D George… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite their stellar performance on a wide range of tasks, including in-context tasks only
revealed during inference, vanilla transformers and variants trained for next-token …