Partially Observed Optimal Stochastic Control: Regularity, Optimality, Approximations, and Learning

AD Kara, S Yüksel - arXiv preprint arXiv:2412.06735, 2024 - arxiv.org
In this review/tutorial article, we present recent progress on optimal control of partially
observed Markov Decision Processes (POMDPs). We first present regularity and continuity …

A Theoretical Justification for Asymmetric Actor-Critic Algorithms

G Lambrechts, D Ernst, A Mahajan - arXiv preprint arXiv:2501.19116, 2025 - arxiv.org
In reinforcement learning for partially observable environments, many successful algorithms
were developed within the asymmetric learning paradigm. This paradigm leverages …

Refined Bounds on Near Optimality Finite Window Policies in POMDPs and Their Reinforcement Learning

YE Demirci, AD Kara, S Yüksel - arXiv preprint arXiv:2409.04351, 2024 - arxiv.org
Finding optimal policies for Partially Observable Markov Decision Processes (POMDPs) is
challenging due to their uncountable state spaces when transformed into fully observable …