Google Scholar

Unifying principles of generalization: past, present, and future

CM Wu, B Meder, E Schulz - Annual Review of Psychology, 2024 - annualreviews.org

Generalization, defined as applying limited experiences to novel situations, represents a
cornerstone of human intelligence. Our review traces the evolution and continuity of …

Cited by 4

[PDF] mlr.press

Is pessimism provably efficient for offline rl?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press

We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

Cited by 450

[PDF] arxiv.org

The statistical complexity of interactive decision making

DJ Foster, SM Kakade, J Qian, A Rakhlin - arXiv preprint arXiv:2112.13487, 2021 - arxiv.org

A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …

Cited by 205

[PDF] mlr.press

Bilinear classes: A structural framework for provable generalization in rl

S Du, S Kakade, J Lee, S Lovett… - International …, 2021 - proceedings.mlr.press

This work introduces Bilinear Classes, a new structural framework, which permit
generalization in reinforcement learning in a wide variety of settings through the use of …

Cited by 244

[PDF] mlr.press

Nearly minimax optimal reinforcement learning for linear mixture markov decision processes

D Zhou, Q Gu, C Szepesvari - Conference on Learning …, 2021 - proceedings.mlr.press

We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …

Cited by 245

[PDF] neurips.cc

Policy finetuning: Bridging sample-efficient offline and online reinforcement learning

T Xie, N Jiang, H Wang, C Xiong… - Advances in neural …, 2021 - proceedings.neurips.cc

Recent theoretical work studies sample-efficient reinforcement learning (RL) extensively in
two settings: learning interactively in the environment (online RL), or learning from an offline …

Cited by 183

[PDF] mlr.press

Human-in-the-loop: Provably efficient preference-based reinforcement learning with general function approximation

X Chen, H Zhong, Z Yang, Z Wang… - … on Machine Learning, 2022 - proceedings.mlr.press

We study human-in-the-loop reinforcement learning (RL) with trajectory preferences, where
instead of receiving a numeric reward at each step, the RL agent only receives preferences …

Cited by 61

[PDF] mlr.press

Guarantees for epsilon-greedy reinforcement learning with function approximation

C Dann, Y Mansour, M Mohri… - International …, 2022 - proceedings.mlr.press

Myopic exploration policies such as epsilon-greedy, softmax, or Gaussian noise fail to
explore efficiently in some reinforcement learning tasks and yet, they perform well in many …

Cited by 72

[PDF] arxiv.org

The role of coverage in online reinforcement learning

T Xie, DJ Foster, Y Bai, N Jiang, SM Kakade - arXiv preprint arXiv …, 2022 - arxiv.org

Coverage conditions--which assert that the data logging distribution adequately covers the
state space--play a fundamental role in determining the sample complexity of offline …

Cited by 76

[PDF] neurips.cc

Corruption-robust offline reinforcement learning with general function approximation

C Ye, R Yang, Q Gu, T Zhang - Advances in Neural …, 2024 - proceedings.neurips.cc

We investigate the problem of corruption robustness in offline reinforcement learning (RL)
with general function approximation, where an adversary can corrupt each sample in the …

Cited by 19

Reinforcement learning with general value function approximation: Provably efficient approach...

Unifying principles of generalization: past, present, and future
