Google Scholar

Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup - Journal of Artificial Intelligence …, 2022 - jair.org

In this article, we aim to provide a literature review of different formulations and approaches
to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We …

Cited by 355. All 10 versions.

[HTML] sciencedirect.com

[HTML] Review of online learning for control and diagnostics of power converters and drives: Algorithms, implementations and applications

M Zhang, PI Gómez, Q Xu, T Dragicevic - Renewable and Sustainable …, 2023 - Elsevier

Power converters and motor drives are playing a significant role in the transition towards
sustainable energy systems and transportation electrification. In this context, rich diversity of …

Cited by 16. All 6 versions.

[PDF] mlr.press

Adversarially trained actor critic for offline reinforcement learning

CA Cheng, T …

… for uncertainty-driven offline reinforcement learning

C Bai, L Wang, Z Yang, Z Deng, A Garg, P Liu… - arxiv preprint arxiv …, 2022 - arxiv.org

Offline Reinforcement Learning (RL) aims to learn policies from previously collected
datasets without exploring the environment. Directly applying off-policy algorithms to offline …

Cited by 168. All 5 versions.

[PDF] mlr.press

Pessimistic Q-learning for offline reinforcement learning: Towards optimal sample complexity

L Shi, G Li, Y Wei, Y Chen… - … conference on machine …, 2022 - proceedings.mlr.press

Offline or batch reinforcement learning seeks to learn a near-optimal policy using history
data without active exploration of the environment. To counter the insufficient coverage and …

Cited by 113. All 10 versions.

[PDF] arxiv.org

Hybrid RL: Using both offline and online data can make RL efficient

Y Song, Y Zhou, A Sekhari, JA Bagnell… - arxiv preprint arxiv …, 2022 - arxiv.org

We consider a hybrid reinforcement learning setting (Hybrid RL), in which an agent has
access to an offline dataset and the ability to collect experience via real-world online …

Cited by 93. All 5 versions.

[PDF] arxiv.org

Pessimistic model-based offline reinforcement learning under partial coverage

M Uehara, W Sun - arxiv preprint arxiv:2107.06226, 2021 - arxiv.org

We study model-based offline Reinforcement Learning with general function approximation
without a full coverage assumption on the offline data distribution. We present an algorithm …

Cited by 163. All 5 versions.

[PDF] arxiv.org

Settling the sample complexity of model-based offline reinforcement learning

G Li, L Shi, Y Chen, Y Chi, Y Wei - The Annals of Statistics, 2024 - projecteuclid.org

The Annals of Statistics 2024, Vol. 52, No. 1, 233–260. https://doi.org/10.1214/23-AOS2342 © …

Cited by 96. All 10 versions.

[PDF] arxiv.org

Reinforcement learning with human feedback: Learning dynamic choices via pessimism

Z Li, Z Yang, M Wang - arxiv preprint arxiv:2305.18438, 2023 - arxiv.org

In this paper, we study offline Reinforcement Learning with Human Feedback (RLHF) where
we aim to learn the human's underlying reward and the MDP's optimal policy from a set of …

Cited by 55. All 4 versions.
