Generative models as an emerging paradigm in the chemical sciences

DM Anstine, O Isayev - Journal of the American Chemical Society, 2023 - ACS Publications
Traditional computational approaches to design chemical species are limited by the need to
compute properties for a vast number of candidates, eg, by discriminative modeling …

Safe learning in robotics: From learning-based control to safe reinforcement learning

L Brunke, M Greeff, AW Hall, Z Yuan… - Annual Review of …, 2022 - annualreviews.org
The last half decade has seen a steep rise in the number of contributions on safe learning
methods for real-world robotic deployments from both the control and reinforcement learning …

Open problems and fundamental limitations of reinforcement learning from human feedback

S Casper, X Davies, C Shi, TK Gilbert… - arxiv preprint arxiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …

Eureka: Human-level reward design via coding large language models

YJ Ma, W Liang, G Wang, DA Huang, O Bastani… - arxiv preprint arxiv …, 2023 - arxiv.org
Large Language Models (LLMs) have excelled as high-level semantic planners for
sequential decision-making tasks. However, harnessing them to learn complex low-level …

Rlprompt: Optimizing discrete text prompts with reinforcement learning

M Deng, J Wang, CP Hsieh, Y Wang, H Guo… - arxiv preprint arxiv …, 2022 - arxiv.org
Prompting has shown impressive success in enabling large pretrained language models
(LMs) to perform diverse NLP tasks, especially when only few downstream data are …

Stable-baselines3: Reliable reinforcement learning implementations

A Raffin, A Hill, A Gleave, A Kanervisto… - Journal of Machine …, 2021 - jmlr.org
STABLE-BASELINES3 provides open-source implementations of deep reinforcement
learning (RL) algorithms in Python. The implementations have been benchmarked against …

Deep reinforcement learning at the edge of the statistical precipice

R Agarwal, M Schwarzer, PS Castro… - Advances in neural …, 2021 - proceedings.neurips.cc
Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing
their relative performance on a large suite of tasks. Most published results on deep RL …

A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org
The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

A minimalist approach to offline reinforcement learning

S Fujimoto, SS Gu - Advances in neural information …, 2021 - proceedings.neurips.cc
Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.
Due to errors in value estimation from out-of-distribution actions, most offline RL algorithms …

Deep reinforcement learning in smart manufacturing: A review and prospects

C Li, P Zheng, Y Yin, B Wang, L Wang - CIRP Journal of Manufacturing …, 2023 - Elsevier
To facilitate the personalized smart manufacturing paradigm with cognitive automation
capabilities, Deep Reinforcement Learning (DRL) has attracted ever-increasing attention by …