Google Scholar

Guide your agent with adaptive multimodal rewards

C Kim, Y Seo, H Liu, L Lee, J Shin… - Advances in Neural …, 2024 - proceedings.neurips.cc

Developing an agent capable of adapting to unseen environments remains a difficult
challenge in imitation learning. This work presents Adaptive Return-conditioned Policy …

Mastering robot manipulation with multimodal prompts through pretraining and multi-task fine-tuning

J Li, Q Gao, M Johnston, X Gao, X He… - arxiv preprint arxiv …, 2023 - arxiv.org

Prompt-based learning has been demonstrated as a compelling paradigm contributing to
large language models' tremendous success (LLMs). Inspired by their success in language …

Reward-Relevance-Filtered Linear Offline Reinforcement Learning

A Zhou - … Conference on Artificial Intelligence and Statistics, 2024 - proceedings.mlr.press

This paper studies offline reinforcement learning with linear function approximation in a
setting with decision-theoretic, but not estimation sparsity. The structural restrictions of the …

Masked and Inverse Dynamics Modeling for Data-Efficient Reinforcement Learning

YJ Lee, J Kim, YJ Park, M Kwak… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

In pixel-based deep reinforcement learning (DRL), learning representations of states that
change because of an agent's action or interaction with the environment poses a critical …

Learning Abstract World Model for Value-preserving Planning with Options

R Rodriguez-Sanchez, G Konidaris - arxiv preprint arxiv:2406.15850, 2024 - arxiv.org

General-purpose agents require fine-grained controls and rich sensory inputs to perform a
wide range of tasks. However, this complexity often leads to intractable decision-making …

Exact and soft successive refinement of the information bottleneck

H Charvin, N Catenacci Volpi, D Polani - Entropy, 2023 - mdpi.com

The information bottleneck (IB) framework formalises the essential requirement for efficient
information processing systems to achieve an optimal balance between the complexity of …

Reinforcement Learning under Latent Dynamics: Toward Statistical and Algorithmic Modularity

P Amortila, DJ Foster, N Jiang, A Krishnamurthy… - arxiv preprint arxiv …, 2024 - arxiv.org

Real-world applications of reinforcement learning often involve environments where agents
operate on complex, high-dimensional observations, but the underlying ("latent") dynamics …

Policy-shaped prediction: avoiding distractions in model-based reinforcement learning

M Hutson, I Kauvar, N Haber - arxiv preprint arxiv:2412.05766, 2024 - arxiv.org

Model-based reinforcement learning (MBRL) is a promising route to sample-efficient policy
optimization. However, a known vulnerability of reconstruction-based MBRL consists of …

Learning Fused State Representations for Control from Multi-View Observations

Z Wang, YH Li, X Li, H Zang, R Laroche… - arxiv preprint arxiv …, 2025 - arxiv.org

Multi-View Reinforcement Learning (MVRL) seeks to provide agents with multi-view
observations, enabling them to perceive environment with greater effectiveness and …

Learning Cognitive Maps from Transformer Representations for Efficient Planning in Partially Observed Environments

A Dedieu, W Lehrach, G Zhou, D George… - arxiv preprint arxiv …, 2024 - arxiv.org

Despite their stellar performance on a wide range of tasks, including in-context tasks only
revealed during inference, vanilla transformers and variants trained for next-token …

Guide your agent with adaptive multimodal rewards
