Representation learning for online and offline RL in low-rank MDPs

M Uehara, X Zhang, W Sun - arXiv preprint arXiv:2110.04652, 2021 - arxiv.org
This work studies the question of Representation Learning in RL: how can we learn a
compact low-dimensional representation such that on top of the representation we can …

Spectral entry-wise matrix estimation for low-rank reinforcement learning

S Stojanovic, Y Jedra… - Advances in Neural …, 2023 - proceedings.neurips.cc
We study matrix estimation problems arising in reinforcement learning with low-rank
structure. In low-rank bandits, the matrix to be recovered specifies the expected arm …

Tackling combinatorial distribution shift: A matrix completion perspective

M Simchowitz, A Gupta… - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press
Obtaining rigorous statistical guarantees for generalization under distribution shift remains
an open and active research area. We study a setting we call combinatorial …

Adaptive discretization in online reinforcement learning

SR Sinclair, S Banerjee, CL Yu - Operations Research, 2023 - pubsonline.informs.org
Discretization-based approaches to solving online reinforcement learning problems are
studied extensively on applications such as resource allocation and cache management …

Learning to extrapolate: A transductive approach

A Netanyahu, A Gupta, M Simchowitz, K Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Machine learning systems, especially with overparameterized deep neural networks, can
generalize to novel test instances drawn from the same distribution as the training data …

Overcoming the long horizon barrier for sample-efficient reinforcement learning with latent low-rank structure

T Sam, Y Chen, CL Yu - Proceedings of the ACM on Measurement and …, 2023 - dl.acm.org
The practicality of reinforcement learning algorithms has been limited due to poor scaling
with respect to the problem size, as the sample complexity of learning an ε-optimal policy is …

Nearly optimal latent state decoding in block MDPs

Y Jedra, J Lee, A Proutiere… - … Conference on Artificial …, 2023 - proceedings.mlr.press
We consider the problem of model estimation in episodic Block MDPs. In these MDPs, the
decision maker has access to rich observations or contexts generated from a small number …

PerSim: Data-efficient offline reinforcement learning with heterogeneous agents via personalized simulators

A Agarwal, A Alomar, V Alumootil… - Advances in …, 2021 - proceedings.neurips.cc
We consider offline reinforcement learning (RL) with heterogeneous agents under severe
data scarcity, i.e., we only observe a single historical trajectory for every agent under an …

Agnostic reinforcement learning with low-rank MDPs and rich observations

A Sekhari, C Dann, M Mohri… - Advances in Neural …, 2021 - proceedings.neurips.cc
There have been many recent advances on provably efficient Reinforcement Learning (RL)
in problems with rich observation spaces. However, all these works share a strong …

From Self-Attention to Markov Models: Unveiling the Dynamics of Generative Transformers

ME Ildiz, Y Huang, Y Li, AS Rawat, S Oymak - arXiv preprint arXiv …, 2024 - arxiv.org
Modern language models rely on the transformer architecture and attention mechanism to
perform language understanding and text generation. In this work, we study learning a 1 …