- Academic Search

Bilinear classes: A structural framework for provable generalization in rl

S Du, S Kakade, J Lee, S Lovett… - International …, 2021 - proceedings.mlr.press

Abstract This work introduces Bilinear Classes, a new structural framework, which permit
generalization in reinforcement learning in a wide variety of settings through the use of …

Cited by 245

[PDF] arxiv.org

Representation learning for online and offline rl in low-rank mdps

M Uehara, X Zhang, W Sun - ar…

… for uncertainty-driven offline reinforcement learning

C Bai, L Wang, Z Yang, Z Deng, A Garg, P Liu… - arxiv preprint arxiv …, 2022 - arxiv.org

Offline Reinforcement Learning (RL) aims to learn policies from previously collected
datasets without exploring the environment. Directly applying off-policy algorithms to offline …

Cited by 163

[PDF] mlr.press

On information gain and regret bounds in gaussian process bandits

S Vakili, K Khezeli, V Picheny - International Conference on …, 2021 - proceedings.mlr.press

Consider the sequential optimization of an expensive to evaluate and possibly non-convex
objective function $ f $ from noisy feedback, that can be considered as a continuum-armed …

Cited by 152

[PDF] neurips.cc

Mitigating covariate shift in imitation learning via offline data with partial coverage

J Chang, M Uehara, D Sreenivas… - Advances in Neural …, 2021 - proceedings.neurips.cc

This paper studies offline Imitation Learning (IL) where an agent learns to imitate an expert
demonstrator without additional online environment interactions. Instead, the learner is …

Cited by 104

[PDF] mlr.press

Distributionally robust model-based reinforcement learning with large state spaces

SS Ramesh, PG Sessa, Y Hu… - International …, 2024 - proceedings.mlr.press

Three major challenges in reinforcement learning are the complex dynamical systems with
large state spaces, the costly data acquisition processes, and the deviation of real-world …

Cited by 10

[PDF] neurips.cc

Model-based rl with optimistic posterior sampling: Structural conditions and sample complexity

A Agarwal, T Zhang - Advances in Neural Information …, 2022 - proceedings.neurips.cc

We propose a general framework to design posterior sampling methods for model-based
RL. We show that the proposed algorithms can be analyzed by reducing regret to Hellinger …

Cited by 37

[PDF] arxiv.org

On function approximation in reinforcement learning: Optimism in the face of large state spaces

Z Yang, C Jin, Z Wang, M Wang, MI Jordan - arxiv preprint arxiv …, 2020 - arxiv.org

The classical theory of reinforcement learning (RL) has focused on tabular and linear
representations of value functions. Further progress hinges on combining RL with modern …

Cited by 89

[PDF] neurips.cc

Optimal exploration for model-based rl in nonlinear systems

A Wagenmaker, G Shi… - Advances in Neural …, 2023 - proceedings.neurips.cc

Learning to control unknown nonlinear dynamical systems is a fundamental problem in
reinforcement learning and control theory. A commonly applied approach is to first explore …

Cited by 19

Information theoretic regret bounds for online nonlinear control

Bilinear classes: A structural framework for provable generalization in rl
