Reinforcement learning for logistics and supply chain management: Methodologies, state of the art, and future opportunities
With advances in technologies, data science techniques, and computing equipment, there
has been rapidly increasing interest in the applications of reinforcement learning (RL) to …
Toward theoretical understandings of robust Markov decision processes: Sample complexity and asymptotics
The Annals of Statistics 2022, Vol. 50, No. 6, 3223–3248 …
Policy gradient in robust MDPs with global convergence guarantee
Abstract Robust Markov decision processes (RMDPs) provide a promising framework for
computing reliable policies in the face of model errors. Many successful reinforcement …
Partial policy iteration for L1-robust Markov decision processes
Robust Markov decision processes (MDPs) compute reliable solutions for dynamic decision
problems with partially-known transition probabilities. Unfortunately, accounting for …
Model-based offline reinforcement learning with pessimism-modulated dynamics belief
K Guo, S Yunfeng, Y Geng - Advances in Neural …, 2022 - proceedings.neurips.cc
Abstract Model-based offline reinforcement learning (RL) aims to find highly rewarding
policy, by leveraging a previously collected static dataset and a dynamics model. While the …
Fast Algorithms for -constrained S-rectangular Robust MDPs
Abstract Robust Markov decision processes (RMDPs) are a useful building block of robust
reinforcement learning algorithms but can be hard to solve. This paper proposes a fast …
Value-distributional model-based reinforcement learning
Quantifying uncertainty about a policy's long-term performance is important to solve
sequential decision-making tasks. We study the problem from a model-based Bayesian …
Solving multi-model MDPs by coordinate ascent and dynamic programming
Multi-model Markov decision process (MMDP) is a promising framework for computing
policies that are robust to parameter uncertainty in MDPs. MMDPs aim to find a policy that …
Robust satisficing MDPs
Despite being a fundamental building block for reinforcement learning, Markov decision
processes (MDPs) often suffer from ambiguity in model parameters. Robust MDPs are …
Percentile criterion optimization in offline reinforcement learning
In reinforcement learning, robust policies for high-stakes decision-making problems with
limited data are usually computed by optimizing the percentile criterion. The percentile …