Motion planning for autonomous driving: The state of the art and future perspectives
Intelligent vehicles (IVs) have gained worldwide attention due to their increased
convenience, safety advantages, and potential commercial value. Despite predictions of …
Knowledge-integrated machine learning for materials: lessons from gameplaying and robotics
As materials researchers increasingly embrace machine-learning (ML) methods, it is natural
to wonder what lessons can be learned from other fields undergoing similar developments …
Open problems and fundamental limitations of reinforcement learning from human feedback
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …
Principled reinforcement learning with human feedback from pairwise or k-wise comparisons
We provide a theoretical framework for Reinforcement Learning with Human Feedback
(RLHF). We show that when the underlying true reward is linear, under both Bradley-Terry …
Video pretraining (VPT): Learning to act by watching unlabeled online videos
Pretraining on noisy, internet-scale datasets has been heavily studied as a technique for
training models with broad, general capabilities for text, images, and other modalities …
On the opportunities and risks of foundation models
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …
A survey of zero-shot generalisation in deep reinforcement learning
The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …
Defining and characterizing reward gaming
We provide the first formal definition of reward hacking, a phenomenon where
optimizing an imperfect proxy reward function, $\mathcal{\tilde{R}}$, leads to poor …
Deep reinforcement learning in smart manufacturing: A review and prospects
To facilitate the personalized smart manufacturing paradigm with cognitive automation
capabilities, Deep Reinforcement Learning (DRL) has attracted ever-increasing attention by …
Reinforcement learning algorithms: A brief survey
Reinforcement Learning (RL) is a machine learning (ML) technique to learn sequential
decision-making in complex problems. RL is inspired by trial-and-error based human/animal …