Training language models to follow instructions with human feedback
Making language models bigger does not inherently make them better at following a user's
intent. For example, large language models can generate outputs that are untruthful, toxic, or …
Perceiver IO: A general architecture for structured inputs & outputs
A central goal of machine learning is the development of systems that can solve many
problems in as many data domains as possible. Current architectures, however, cannot be …
Decision transformer: Reinforcement learning via sequence modeling
We introduce a framework that abstracts Reinforcement Learning (RL) as a sequence
modeling problem. This allows us to draw upon the simplicity and scalability of the …
Frozen pretrained transformers as universal computation engines
We investigate the capability of a transformer pretrained on natural language to generalize
to other modalities with minimal finetuning--in particular, without finetuning of the self …
Language-conditioned learning for robotic manipulation: A survey
Language-conditioned robotic manipulation represents a cutting-edge area of research,
enabling seamless communication and cooperation between humans and robotic agents …
Collaborating with humans without human data
Collaborating with humans requires rapidly adapting to their individual strengths,
weaknesses, and preferences. Unfortunately, most standard multi-agent reinforcement …
Natural language instructions induce compositional generalization in networks of neurons
R Riveland, A Pouget - Nature Neuroscience, 2024 - nature.com
A fundamental human cognitive feat is to interpret linguistic instructions in order to perform
novel tasks without explicit task experience. Yet, the neural computations that might be used …
Vision-language models as success detectors
Detecting successful behaviour is crucial for training intelligent agents. As such,
generalisable reward models are a prerequisite for agents that can learn to generalise their …
Habitat-Web: Learning embodied object-search strategies from human demonstrations at scale
R Ramrakhya, E Undersander… - Proceedings of the …, 2022 - openaccess.thecvf.com
We present a large-scale study of imitating human demonstrations on tasks that require a
virtual robot to search for objects in new environments-(1) ObjectGoal Navigation (eg'find & …
PIRLNav: Pretraining with imitation and RL finetuning for ObjectNav
We study ObjectGoal Navigation--where a virtual robot situated in a new
environment is asked to navigate to an object. Prior work has shown that imitation learning …