Partially Observed Optimal Stochastic Control: Regularity, Optimality, Approximations, and Learning

AD Kara, S Yüksel - arXiv preprint arXiv:2412.06735, 2024 - arxiv.org
In this review/tutorial article, we present recent progress on optimal control of partially
observed Markov Decision Processes (POMDPs). We first present regularity and continuity …

Convergence of finite memory Q learning for POMDPs and near optimality of learned policies under filter stability

AD Kara, S Yüksel - Mathematics of Operations Research, 2023 - pubsonline.informs.org
In this paper, for partially observed Markov decision problems (POMDPs), we provide the
convergence of a Q-learning algorithm for control policies using a finite history of past …

Q-learning for stochastic control under general information structures and non-Markovian environments

AD Kara, S Yüksel - arXiv preprint arXiv:2311.00123, 2023 - arxiv.org
As a primary contribution, we present a convergence theorem for stochastic iterations, and in
particular, Q-learning iterates, under a general, possibly non-Markovian, stochastic …

Q-learning for MDPs with general spaces: Convergence and near optimality via quantization under weak continuity

A Kara, N Saldi, S Yüksel - Journal of Machine Learning Research, 2023 - jmlr.org
Reinforcement learning algorithms often require finiteness of state and action spaces in
Markov decision processes (MDPs) (also called controlled Markov chains) and various …

Near optimality of finite memory feedback policies in partially observed Markov decision processes

AD Kara, S Yüksel - Journal of Machine Learning Research, 2022 - jmlr.org
In the theory of Partially Observed Markov Decision Processes (POMDPs), the existence of
optimal policies has in general been established via converting the original partially …

Continuous-time Markov decision processes

A Piunovskiy, Y Zhang - Probability Theory and Stochastic Modelling, 2020 - Springer
The study of continuous-time Markov decision processes dates back at least to the 1950s,
shortly after that of its discrete-time analogue. Since then, the theory has rapidly developed …

[BOOK][B] Finite Approximations in Discrete-Time Stochastic Control

N Saldi, T Linder, S Yüksel - 2018 - Springer
Control and optimization of dynamical systems in the presence of stochastic uncertainty is a
mature field with a large range of applications. A comprehensive treatment of such problems …

Approximate Nash equilibria in partially observed stochastic games with mean-field interactions

N Saldi, T Başar, M Raginsky - Mathematics of Operations …, 2019 - pubsonline.informs.org
Establishing the existence of Nash equilibria for partially observed stochastic dynamic
games is known to be quite challenging, with the difficulties stemming from the noisy nature …

A universal dynamic program and refined existence results for decentralized stochastic control

S Yüksel - SIAM Journal on Control and Optimization, 2020 - SIAM
For sequential stochastic control problems with standard Borel measurement and control
action spaces, we introduce a general (universally applicable) dynamic programming …

Average Cost Optimality of Partially Observed MDPs: Contraction of Nonlinear Filters and Existence of Optimal Solutions and Approximations

YE Demirci, AD Kara, S Yüksel - SIAM Journal on Control and Optimization, 2024 - SIAM
The average cost optimality is known to be a challenging problem for partially observable
stochastic control, with few results available beyond the finite state, action, and …