Deep generative models for offline policy learning: Tutorial, survey, and perspectives on future directions

J Chen, B Ganguly, Y Xu, Y Mei, T Lan… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep generative models (DGMs) have demonstrated great success across various domains,
particularly in generating texts, images, and videos using models trained from offline data …

Functional Graphical Models: Structure Enables Offline Data-Driven Optimization

K Grudzien, M Uehara, S Levine… - International …, 2024 - proceedings.mlr.press
While machine learning models are typically trained to solve prediction problems, we might
often want to use them for optimization problems. For example, given a dataset of proteins …

Latent energy-based odyssey: Black-box optimization via expanded exploration in the energy-based latent space

P Yu, D Zhang, H He, X Ma, R Miao, Y Lu… - arXiv preprint arXiv …, 2024 - arxiv.org
Offline Black-Box Optimization (BBO) aims at optimizing a black-box function using the
knowledge from a pre-collected offline dataset of function values and corresponding input …

Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction

H Qi, X Geng, S Rando, I Ohama, A Kumar… - arXiv preprint arXiv …, 2023 - arxiv.org
In computational chemistry, crystal structure prediction (CSP) is an optimization problem that
involves discovering the lowest energy stable crystal structure for a given chemical formula …

Sharpness-Aware Black-Box Optimization

F Ye, Y Lyu, X Wang, M Sugiyama, Y Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
Black-box optimization algorithms have been widely used in various machine learning
problems, including reinforcement learning and prompt fine-tuning. However, directly …

Offline Model-Based Optimization by Learning to Rank

RX Tan, K Xue, SH Lyu, H Shang, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Offline model-based optimization (MBO) aims to identify a design that maximizes a
black-box function using only a fixed, pre-collected dataset of designs and their corresponding …

Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows

M Cho, JP How, C Sun - arXiv preprint arXiv:2405.03892, 2024 - arxiv.org
Despite notable successes of Reinforcement Learning (RL), the prevalent use of an online
learning paradigm prevents its widespread adoption, especially in hazardous or costly …

Latent conservative objective models for offline data-driven crystal structure prediction

H Qi, S Rando, X Geng, I Ohama, A Kumar, S Levine - 2023 - openreview.net
In computational chemistry, crystal structure prediction (CSP) is an optimization problem that
involves discovering the lowest energy stable crystal structure for a given chemical formula …

When is Offline Policy Selection Sample Efficient for Reinforcement Learning?

V Liu, P Nagarajan, A Patterson, M White - arXiv preprint arXiv …, 2023 - arxiv.org
Offline reinforcement learning algorithms often require careful hyperparameter tuning.
Consequently, before deployment, we need to select amongst a set of candidate policies. As …

ROMO: Retrieval-enhanced Offline Model-based Optimization

M Chen, H Zhao, Y Zhao, H Fan, H Gao, Y Yu… - Proceedings of the Fifth …, 2023 - dl.acm.org
Data-driven black-box model-based optimization (MBO) problems arise in a great number of
practical application scenarios, where the goal is to find a design over the whole space …