- Academic Search

Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer

Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Cited by 1719 - All 8 versions

[PDF] arxiv.org

Reinforcement learning for selective key applications in power systems: Recent advances and future challenges

X Chen, G Qu, Y Tang, S Low… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

With large-scale integration of renewable generation and distributed energy resources,
modern power systems are confronted with new operational challenges, such as growing …

Cited by 283 - All 6 versions

[BOOK][B] Control systems and reinforcement learning

S Meyn - 2022 - books.google.com

A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …

Cited by 158 - All 3 versions

[PDF] neurips.cc

Online robust reinforcement learning with model uncertainty

Y Wang, S Zou - Advances in Neural Information Processing …, 2021 - proceedings.neurips.cc

Robust reinforcement learning (RL) is to find a policy that optimizes the worst-case
performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust …

Cited by 114 - All 10 versions

[PDF] mlr.press

Federated reinforcement learning: Linear speedup under Markovian sampling

S Khodadadian, P Sharma, G Joshi… - International …, 2022 - proceedings.mlr.press

Since reinforcement learning algorithms are notoriously data-intensive, the task of sampling
observations from the environment is usually split across multiple agents. However …

Cited by 75 - All 7 versions

[PDF] mlr.press

CRPO: A new approach for safe reinforcement learning with convergence guarantee

T Xu, Y Liang, G Lan - International Conference on Machine …, 2021 - proceedings.mlr.press

In safe reinforcement learning (SRL) problems, an agent explores the environment to
maximize an expected total reward and meanwhile avoids violation of certain constraints on …

Cited by 153 - All 7 versions

[PDF] neurips.cc

Finite-sample analysis for SARSA with linear function approximation

S Zou, T Xu, Y Liang - Advances in neural information …, 2019 - proceedings.neurips.cc

SARSA is an on-policy algorithm to learn a Markov decision process policy in reinforcement
learning. We investigate the SARSA algorithm with linear function approximation under the …

Cited by 209 - All 9 versions

[PDF] neurips.cc

A finite-time analysis of two time-scale actor-critic methods

YF Wu, W Zhang, P Xu, Q Gu - Advances in Neural …, 2020 - proceedings.neurips.cc

Actor-critic (AC) methods have exhibited great empirical success compared with other
reinforcement learning algorithms, where the actor uses the policy gradient to improve the …

Cited by 166 - All 7 versions

[PDF] neurips.cc

Breaking the sample size barrier in model-based reinforcement learning with a generative model

G Li, Y Wei, Y Chi, Y Gu… - Advances in neural …, 2020 - proceedings.neurips.cc

We investigate the sample efficiency of reinforcement learning in a $\gamma$-discounted
infinite-horizon Markov decision process (MDP) with state space S and action space A …

Cited by 143 - All 10 versions

[HTML] informs.org

Is Q-learning minimax optimal? a tight sample complexity analysis

G Li, C Cai, Y Chen, Y Wei, Y Chi - Operations Research, 2024 - pubsonline.informs.org

Q-learning, which seeks to learn the optimal Q-function of a Markov decision process (MDP)
in a model-free fashion, lies at the heart of reinforcement learning. When it comes to the …

Cited by 101 - All 11 versions

Saved in My collection

Finite-time error bounds for linear stochastic approximation and TD learning

Multi-agent reinforcement learning: A selective overview of theories and algorithms
