Google Scholar

EH Sumiea, SJ Abdulkadir, HS Alhussian, SM Al-Selwi… - Heliyon, 2024 - cell.com

Abstract Deep Reinforcement Learning (DRL) has gained significant adoption in diverse
fields and applications, mainly due to its proficiency in resolving complicated decision …

Cited by 28 - All 11 versions

[PDF] jmlr.org

Policy gradient and actor-critic learning in continuous time and space: Theory and algorithms

Y Jia, XY Zhou - Journal of Machine Learning Research, 2022 - jmlr.org

We study policy gradient (PG) for reinforcement learning in continuous time and space
under the regularized exploratory formulation developed by Wang et al. (2020). We …

Cited by 98 - All 8 versions

[PDF] neurips.cc

Text-based interactive recommendation via constraint-augmented reinforcement learning

R Zhang, T Yu, Y Shen, H Jin… - Advances in neural …, 2019 - proceedings.neurips.cc

Text-based interactive recommendation provides richer user preferences and has
demonstrated advantages over traditional interactive recommender systems. However …

Cited by 150 - All 13 versions

[PDF] neurips.cc

Policy optimization for continuous reinforcement learning

H Zhao, W Tang, D Yao - Advances in Neural Information …, 2023 - proceedings.neurips.cc

We study reinforcement learning (RL) in the setting of continuous time and space, for an
infinite horizon with a discounted objective and the underlying dynamics driven by a …

Cited by 21 - All 10 versions

[PDF] mdpi.com

Data-driven robotic manipulation of cloth-like deformable objects: The present, challenges and future prospects

HA Kadi, K Terzić - Sensors, 2023 - mdpi.com

Manipulating cloth-like deformable objects (CDOs) is a long-standing problem in the
robotics community. CDOs are flexible (non-rigid) objects that do not show a detectable level …

Cited by 11 - All 16 versions

[PDF] jmlr.org

q-Learning in continuous time

Y Jia, XY Zhou - Journal of Machine Learning Research, 2023 - jmlr.org

We study the continuous-time counterpart of Q-learning for reinforcement learning (RL)
under the entropy-regularized, exploratory diffusion process formulation introduced by Wang …

Cited by 40 - All 7 versions

[PDF] neurips.cc

Model-based reinforcement learning for semi-Markov decision processes with neural ODEs

J Du, J Futoma, F Doshi-Velez - Advances in Neural …, 2020 - proceedings.neurips.cc

We present two elegant solutions for modeling continuous-time dynamics, in a novel
model-based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs) …

Cited by 71 - All 8 versions

[PDF] arxiv.org

Reinforcement learning for jump-diffusions, with financial applications

X Gao, L Li, XY Zhou - arXiv preprint arXiv:2405.16449, 2024 - arxiv.org

We study continuous-time reinforcement learning (RL) for stochastic control in which system
dynamics are governed by jump-diffusion processes. We formulate an entropy-regularized …

Cited by 10 - All 5 versions

[PDF] neurips.cc

Efficient exploration in continuous-time model-based reinforcement learning

L Treven, J Hübotter, B Sukhija… - Advances in Neural …, 2023 - proceedings.neurips.cc

Reinforcement learning algorithms typically consider discrete-time dynamics, even though
the underlying systems are often continuous in time. In this paper, we introduce a model …

Cited by 9 - All 8 versions

[PDF] mlr.press

Control frequency adaptation via action persistence in batch reinforcement learning

AM Metelli, F Mazzolini, L Bisi… - International …, 2020 - proceedings.mlr.press

The choice of the control frequency of a system has a relevant impact on the ability of
reinforcement learning algorithms to learn a highly performing policy. In this paper, we …

Cited by 55 - All 11 versions

Making deep q-learning methods robust to time discretization

Deep deterministic policy gradient algorithm: A systematic review
