Policy gradient for rectangular robust Markov decision processes

N Kumar, E Derman, M Geist… - Advances in Neural …, 2023 - proceedings.neurips.cc
Policy gradient methods have become a standard for training reinforcement learning agents
in a scalable and efficient manner. However, they do not account for transition uncertainty …

Soft robust MDPs and risk-sensitive MDPs: Equivalence, policy gradient, and sample complexity

R Zhang, Y Hu, N Li - arXiv … - arxiv.org

Roping in uncertainty: Robustness and regularization in Markov games

J McMahan, G Artiglio, Q Xie - arXiv preprint arXiv:2406.08847, 2024 - arxiv.org
We study robust Markov games (RMG) with $s$-rectangular uncertainty. We show a
general equivalence between computing a robust Nash equilibrium (RNE) of a $s$ …

Bridging distributionally robust learning and offline RL: An approach to mitigate distribution shift and partial data coverage

K Panaganti, Z Xu, D Kalathil… - arXiv preprint arXiv …, 2023 - arxiv.org
The goal of an offline reinforcement learning (RL) algorithm is to learn optimal policies using
historical (offline) data, without access to the environment for online exploration. One of the …

Bring your own (non-robust) algorithm to solve robust MDPs by estimating the worst kernel

U Gadot, K Wang, N Kumar, KY Levy… - Forty-first International …, 2024 - openreview.net
Robust Markov Decision Processes (RMDPs) provide a framework for sequential decision-
making that is robust to perturbations on the transition kernel. However, current RMDP …
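The worst-kernel idea described in this entry — find an adversarial transition kernel inside a rectangular uncertainty set, then hand that kernel to an ordinary, non-robust solver — can be sketched as follows. This is a generic illustration with an assumed (s,a)-rectangular L1-ball uncertainty set around a nominal kernel, not the paper's actual estimation procedure:

```python
import numpy as np

def worst_kernel_l1(P_nom, v, radius):
    """For each (s, a), find the worst transition distribution within an
    L1 ball of `radius` around the nominal row P_nom[s, a], i.e. the one
    minimizing expected next-state value under v. The (s,a)-rectangular
    assumption lets each row be perturbed independently."""
    S, A, _ = P_nom.shape
    P_worst = P_nom.copy()
    worst_next = np.argmin(v)          # push mass toward the lowest-value state
    for s in range(S):
        for a in range(A):
            p = P_worst[s, a]
            budget = radius / 2.0      # at most r/2 mass can be relocated
            # take mass away from the highest-value states first
            for s2 in np.argsort(v)[::-1]:
                if budget <= 0 or s2 == worst_next:
                    continue
                take = min(p[s2], budget)
                p[s2] -= take
                p[worst_next] += take
                budget -= take
    return P_worst

def robust_value_iteration(P_nom, R, gamma, radius, iters=500):
    """Alternate (1) computing the worst kernel for the current value
    estimate and (2) one standard, non-robust Bellman optimality backup
    on that fixed kernel — the 'bring your own algorithm' pattern."""
    S, A, _ = P_nom.shape
    v = np.zeros(S)
    for _ in range(iters):
        P = worst_kernel_l1(P_nom, v, radius)
        q = R + gamma * P @ v          # shape (S, A): ordinary backup
        v = q.max(axis=1)
    return v
```

With `radius=0` this reduces to standard value iteration; a positive radius can only lower the returned values, since the inner step pessimizes each row of the kernel before the standard backup runs.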

Imprecise probabilities meet partial observability: Game semantics for robust POMDPs

EM Bovy, M Suilen, S Junges, N Jansen - arXiv preprint arXiv:2405.04941, 2024 - arxiv.org
Partially observable Markov decision processes (POMDPs) rely on the key assumption that
probability distributions are precisely known. Robust POMDPs (RPOMDPs) alleviate this …

Robust Markov decision processes: A place where AI and formal methods meet

M Suilen, T Badings, EM Bovy, D Parker… - Principles of Verification …, 2024 - Springer
Markov decision processes (MDPs) are a standard model for sequential decision-making
problems and are widely used across many scientific areas, including formal methods and …