Fine-tuning large vision-language models as decision-making agents via reinforcement learning
Large vision-language models (VLMs) fine-tuned on specialized visual instruction-following
data have exhibited impressive language reasoning capabilities across various scenarios …
Object goal navigation using goal-oriented semantic exploration
This work studies the problem of object goal navigation which involves navigating to an
instance of the given object category in unseen environments. End-to-end learning-based …
Evolving curricula with regret-based environment design
Training generally-capable agents with reinforcement learning (RL) remains a significant
challenge. A promising avenue for improving the robustness of RL agents is through the use …
Embodied intelligence via learning and evolution
The intertwined processes of learning and evolution in complex environmental niches have
resulted in a remarkable diversity of morphological forms. Moreover, many aspects of animal …
Learning to explore using active neural SLAM
This work presents a modular and hierarchical approach to learn policies for exploring 3D
environments, called 'Active Neural SLAM'. Our approach leverages the strengths of both …
Character controllers using motion VAEs
HY Ling, F Zinno, G Cheng… - ACM Transactions on …, 2020 - dl.acm.org
A fundamental problem in computer animation is that of realizing purposeful and realistic
human movement given a sufficiently-rich set of motion capture clips. We learn data-driven …
Recurrent independent mechanisms
Learning modular structures which reflect the dynamics of the environment can lead to better
generalization and robustness to changes which only affect a few of the underlying causes …
Reward constrained policy optimization
Solving tasks in Reinforcement Learning is no easy feat. As the goal of the agent is to
maximize the accumulated reward, it often learns to exploit loopholes and misspecifications …
Unsupervised state representation learning in Atari
State representation learning, or the ability to capture latent generative factors of an
environment, is crucial for building intelligent agents that can perform a wide variety of tasks …
Why generalization in RL is difficult: Epistemic POMDPs and implicit partial observability
Generalization is a central challenge for the deployment of reinforcement learning (RL)
systems in the real world. In this paper, we show that the sequential structure of the RL …