Optimal conservative offline RL with general function approximation via augmented Lagrangian

P Rashidinejad, H Zhu, K Yang, S Russell… - arXiv preprint arXiv …, 2022 - arxiv.org
Offline reinforcement learning (RL), which refers to decision-making from a previously-
collected dataset of interactions, has received significant attention over the past years. Much …

An effective negotiating agent framework based on deep offline reinforcement learning

S Chen, J Zhao, G Weiss, R Su… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press
Learning is crucial for automated negotiation, and recent years have witnessed a
remarkable achievement in application of reinforcement learning (RL) for various …

When demonstrations meet generative world models: A maximum likelihood framework for offline inverse reinforcement learning

S Zeng, C Li, A Garcia, M Hong - Advances in Neural …, 2023 - proceedings.neurips.cc
Offline inverse reinforcement learning (Offline IRL) aims to recover the structure of rewards
and environment dynamics that underlie observed actions in a fixed, finite set of …

Understanding expertise through demonstrations: A maximum likelihood framework for offline inverse reinforcement learning

S Zeng, C Li, A Garcia, M Hong - arXiv preprint arXiv:2302.07457, 2023 - arxiv.org
Offline inverse reinforcement learning (Offline IRL) aims to recover the structure of rewards
and environment dynamics that underlie observed actions in a fixed, finite set of …

SCORE: Simple Contrastive Representation and Reset-Ensemble for offline meta-reinforcement learning

H Yang, K Lin, T Yang, G Sun - Knowledge-Based Systems, 2025 - Elsevier
Offline meta-reinforcement learning (OMRL) aims to train agents to quickly adapt to new
tasks using only pre-collected data. However, existing OMRL methods often involve …

A simple unified uncertainty-guided framework for offline-to-online reinforcement learning

S Guo, Y Sun, J Hu, S Huang, H Chen, H Piao… - arXiv preprint arXiv …, 2023 - arxiv.org
Offline reinforcement learning (RL) provides a promising solution to learning an agent fully
relying on a data-driven paradigm. However, constrained by the limited quality of the offline …

Representation-driven reinforcement learning

O Nabati, G Tennenholtz… - … Conference on Machine …, 2023 - proceedings.mlr.press
We present a representation-driven framework for reinforcement learning. By representing
policies as estimates of their expected values, we leverage techniques from contextual …

Delphic offline reinforcement learning under nonidentifiable hidden confounding

A Pace, H Yèche, B Schölkopf, G Rätsch… - arXiv preprint arXiv …, 2023 - arxiv.org
A prominent challenge of offline reinforcement learning (RL) is the issue of hidden
confounding: unobserved variables may influence both the actions taken by the agent and …

SUMO: Search-based uncertainty estimation for model-based offline reinforcement learning

Z Qiao, J Lyu, K Jiao, Q Liu, X Li - arXiv preprint arXiv:2408.12970, 2024 - arxiv.org
The performance of offline reinforcement learning (RL) suffers from the limited size and
quality of static datasets. Model-based offline RL addresses this issue by generating …

Embedding-Aligned Language Models

G Tennenholtz, Y Chow, CW Hsu, L Shani… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose a novel approach for training large language models (LLMs) to adhere to
objectives defined within a latent embedding space. Our method leverages reinforcement …