Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities

Y Yan, AHF Chow, CP Ho, YH Kuo, Q Wu… - … Research Part E …, 2022 - Elsevier
With advances in technologies, data science techniques, and computing equipment, there
has been rapidly increasing interest in the applications of reinforcement learning (RL) to …

Toward theoretical understandings of robust Markov decision processes: Sample complexity and asymptotics

W Yang, L Zhang, Z Zhang - The Annals of Statistics, 2022 - projecteuclid.org
The Annals of Statistics, 2022, Vol. 50, No. 6, 3223–3248 …

Policy gradient in robust MDPs with global convergence guarantee

Q Wang, CP Ho, M Petrik - International Conference on …, 2023 - proceedings.mlr.press
Abstract Robust Markov decision processes (RMDPs) provide a promising framework for
computing reliable policies in the face of model errors. Many successful reinforcement …

Partial policy iteration for L1-robust Markov decision processes

CP Ho, M Petrik, W Wiesemann - Journal of Machine Learning Research, 2021 - jmlr.org
Robust Markov decision processes (MDPs) compute reliable solutions for dynamic decision
problems with partially-known transition probabilities. Unfortunately, accounting for …

Model-based offline reinforcement learning with pessimism-modulated dynamics belief

K Guo, Y Shao, Y Geng - Advances in Neural …, 2022 - proceedings.neurips.cc
Abstract Model-based offline reinforcement learning (RL) aims to find a highly rewarding
policy by leveraging a previously collected static dataset and a dynamics model. While the …

Fast Algorithms for L∞-constrained S-rectangular Robust MDPs

B Behzadian, M Petrik, CP Ho - Advances in Neural …, 2021 - proceedings.neurips.cc
Abstract Robust Markov decision processes (RMDPs) are a useful building block of robust
reinforcement learning algorithms but can be hard to solve. This paper proposes a fast …

Value-distributional model-based reinforcement learning

CE Luis, AG Bottero, J Vinogradska… - Journal of Machine …, 2024 - jmlr.org
Quantifying uncertainty about a policy's long-term performance is important for solving
sequential decision-making tasks. We study the problem from a model-based Bayesian …

Solving multi-model MDPs by coordinate ascent and dynamic programming

X Su, M Petrik - Uncertainty in Artificial Intelligence, 2023 - proceedings.mlr.press
Multi-model Markov decision processes (MMDPs) are a promising framework for computing
policies that are robust to parameter uncertainty in MDPs. MMDPs aim to find a policy that …

Robust satisficing MDPs

H Ruan, S Zhou, Z Chen, CP Ho - … Conference on Machine …, 2023 - proceedings.mlr.press
Despite being a fundamental building block for reinforcement learning, Markov decision
processes (MDPs) often suffer from ambiguity in model parameters. Robust MDPs are …

Percentile criterion optimization in offline reinforcement learning

C Cousins, E Lobo, M Petrik… - Advances in Neural …, 2023 - proceedings.neurips.cc
In reinforcement learning, robust policies for high-stakes decision-making problems with
limited data are usually computed by optimizing the percentile criterion. The percentile …