Vision-language models are zero-shot reward models for reinforcement learning
Reinforcement learning (RL) requires either manually specifying a reward function, which is
often infeasible, or learning a reward model from a large amount of human feedback, which …
RAM: Retrieval-based affordance transfer for generalizable zero-shot robotic manipulation
This work proposes a retrieve-and-transfer framework for zero-shot robotic manipulation,
dubbed RAM, featuring generalizability across various objects, environments, and …
Active preference-based Gaussian process regression for reward learning and optimization
Designing reward functions is a difficult task in AI and robotics. The complex task of directly
specifying all the desirable behaviors a robot needs to optimize often proves challenging for …
Vision-language models as a source of rewards
Building generalist agents that can accomplish many goals in rich open-ended
environments is one of the research frontiers for reinforcement learning. A key limiting factor …
RL-VLM-F: Reinforcement learning from vision language foundation model feedback
Reward engineering has long been a challenge in Reinforcement Learning (RL) research,
as it often requires extensive human effort and iterative processes of trial-and-error to design …
Integrating reinforcement learning with foundation models for autonomous robotics: Methods and perspectives
Foundation models (FMs), large deep learning models pre-trained on vast, unlabeled
datasets, exhibit powerful capabilities in understanding complex patterns and generating …
LLM-empowered state representation for reinforcement learning
Conventional state representations in reinforcement learning often omit critical task-related
details, presenting a significant challenge for value networks in establishing accurate …
Vision-language model-based human-robot collaboration for smart manufacturing: A state-of-the-art survey
Human-robot collaboration (HRC) is set to transform the manufacturing paradigm by
leveraging the strengths of human flexibility and robot precision. The recent breakthrough of …
Evolution and Prospects of Foundation Models: From Large Language Models to Large Multimodal Models.
Z Chen, L Xu, H Zheng, L Chen… - Computers …, 2024 - search.ebscohost.com
Since the 1950s, when the Turing Test was introduced, there has been notable progress in
machine language intelligence. Language modeling, crucial for AI development, has …
EPO: Hierarchical LLM agents with environment preference optimization
Long-horizon decision-making tasks present significant challenges for LLM-based agents
due to the need for extensive planning over multiple steps. In this paper, we propose a …