- Academic Search

A review of safe reinforcement learning: Methods, theory and applications

S Gu, L Yang, Y Du, G Chen, F Walter, J Wang… - arXiv preprint arXiv …, 2022 - arxiv.org

Reinforcement Learning (RL) has achieved tremendous success in many complex decision-
making tasks. However, safety concerns are raised during deploying RL in real-world …

Save Cite Cited by 306 Related articles All 3 versions HTML version

[PDF] kcl.ac.uk

A review of safe reinforcement learning: Methods, theories and applications

S Gu, L Yang, Y Du, G Chen, F Walter… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Reinforcement Learning (RL) has achieved tremendous success in many complex decision-
making tasks. However, safety concerns are raised during deploying RL in real-world …

Save Cite Cited by 21 Related articles All 8 versions

[PDF] mlr.press

Nearly minimax optimal reinforcement learning for linear mixture Markov decision processes

D Zhou, Q Gu, C Szepesvari - Conference on Learning …, 2021 - proceedings.mlr.press

We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …

Save Cite Cited by 247 Related articles All 8 versions HTML version

[PDF] mlr.press

Nearly minimax optimal reinforcement learning for linear Markov decision processes

J He, H Zhao, D Zhou, Q Gu - International Conference on …, 2023 - proceedings.mlr.press

We study reinforcement learning (RL) with linear function approximation. For episodic time-
inhomogeneous linear Markov decision processes (linear MDPs) whose transition …

Save Cite Cited by 60 Related articles All 8 versions HTML version

[PDF] arxiv.org

Settling the sample complexity of model-based offline reinforcement learning

G Li, L Shi, Y Chen, Y Chi, Y Wei - The Annals of Statistics, 2024 - projecteuclid.org

Settling the sample complexity of model-based offline reinforcement learning Page 1 The
Annals of Statistics 2024, Vol. 52, No. 1, 233–260 https://doi.org/10.1214/23-AOS2342 © …

Save Cite Cited by 95 Related articles All 10 versions

[PDF] mlr.press

VOQL: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation

A Agarwal, Y Jin, T Zhang - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press

We study time-inhomogeneous episodic reinforcement learning (RL) under general function
approximation and sparse rewards. We design a new algorithm, Variance-weighted …

Save Cite Cited by 48 Related articles All 5 versions HTML version

[PDF] neurips.cc

Breaking the sample complexity barrier to regret-optimal model-free reinforcement learning

G Li, L Shi, Y Chen, Y Gu, Y Chi - Advances in Neural …, 2021 - proceedings.neurips.cc

Achieving sample efficiency in online episodic reinforcement learning (RL) requires
optimally balancing exploration and exploitation. When it comes to a finite-horizon episodic …

Save Cite Cited by 64 Related articles All 13 versions HTML version

[PDF] mlr.press

Nearly minimax optimal reinforcement learning with linear function approximation

P Hu, Y Chen, L Huang - International Conference on …, 2022 - proceedings.mlr.press

We study reinforcement learning with linear function approximation where the transition
probability and reward functions are linear with respect to a feature mapping $\boldsymbol …

Save Cite Cited by 35 Related articles All 6 versions HTML version

[PDF] neurips.cc

MADE: Exploration via maximizing deviation from explored regions

T Zhang, P Rashidinejad, J Jiao… - Advances in …, 2021 - proceedings.neurips.cc

In online reinforcement learning (RL), efficient exploration remains particularly challenging
in high-dimensional environments with sparse rewards. In low-dimensional environments …

Save Cite Cited by 51 Related articles All 7 versions HTML version

[PDF] mlr.press

Learning stochastic shortest path with linear function approximation

Y Min, J He, T Wang, Q Gu - International Conference on …, 2022 - proceedings.mlr.press

We study the stochastic shortest path (SSP) problem in reinforcement learning with linear
function approximation, where the transition kernel is represented as a linear mixture of …

Save Cite Cited by 33 Related articles All 10 versions HTML version
