- Academic Search

I Fatkhullin, A Barakat, A Kireeva… - … Conference on Machine …, 2023 - proceedings.mlr.press

Recently, the impressive empirical success of policy gradient (PG) methods has catalyzed
the development of their theoretical foundations. Despite the huge efforts directed at the …

Cited by 44; 8 versions

[PDF] arxiv.org

A Fisher-Rao gradient flow for entropy-regularised Markov decision processes in Polish spaces

B Kerimkulov, JM Leahy, D Siska, L Szpruch… - arXiv preprint arXiv …, 2023 - arxiv.org

We study the global convergence of a Fisher-Rao policy gradient flow for infinite-horizon
entropy-regularised Markov decision processes with Polish state and action space. The flow …

Cited by 9; 2 versions

[PDF] springer.com

Geometry and convergence of natural policy gradient methods

J Müller, G Montúfar - Information Geometry, 2024 - Springer

We study the convergence of several natural policy gradient (NPG) methods in infinite-horizon
discounted Markov decision processes with regular policy parametrizations. For a …

Cited by 8; 9 versions

[PDF] mlr.press

On the global convergence of fitted Q-iteration with two-layer neural network parametrization

M Gaur, V Aggarwal, M Agarwal - … Conference on Machine …, 2023 - proceedings.mlr.press

Deep Q-learning based algorithms have been applied successfully in many decision making
problems, while their theoretical foundations are not as well understood. In this paper, we …

Cited by 3; 7 versions

[PDF] arxiv.org

Convex Regularization and Convergence of Policy Gradient Flows under Safety Constraints

P Malo, L Viitasaari, A Suominen, E Vilkkumaa… - arXiv preprint arXiv …, 2024 - arxiv.org

This paper studies reinforcement learning (RL) in infinite-horizon dynamic decision
processes with almost-sure safety constraints. Such safety-constrained decision processes …

3 versions

Geometry of Optimization in Markov Decision Processes and Neural Network-Based PDE Solvers

J Müller - 2023 - ul.qucosa.de

Abstract (EN) This thesis is divided into two parts dealing with the optimization problems in
Markov decision processes (MDPs) and different neural network-based numerical solvers …

Cited by 2

[PDF] mpg.de

[PDF] Geometry and convergence of natural policy gradient methods

G Montúfar, J Müller - 2022 - mis.mpg.de

We study the convergence of several natural policy gradient (NPG) methods in infinite-horizon
discounted Markov decision processes with regular policy parametrizations. For a …

3 versions

[PDF] core.ac.uk

[BOOK] The development of data-driven methods for modelling and optimisation of chemical process systems

M Mowbray - 2022 - search.proquest.com

In this thesis, data driven approaches to sequential decision making problems within
process systems engineering (PSE) are developed. Specifically, the use of model-free …

5 versions

Convergence and optimality of policy gradient methods in weakly smooth settings

Stochastic policy gradient methods: Improved sample complexity for Fisher-non-degenerate policies