Causal reinforcement learning using observational and interventional data
Learning efficiently a causal model of the environment is a key challenge of model-based
RL agents operating in POMDPs. We consider here a scenario where the learning agent …
Reinforcement learning with non-exponential discounting
Commonly in reinforcement learning (RL), rewards are discounted over time using an
exponential function to model time preference, thereby bounding the expected long-term …
Revisiting Continuous-Time Reinforcement Learning. A Study of HJB Solvers Based on PINNs and FEMs
Despite recent advances in Reinforcement Learning (RL), the Markov Decision Processes
(MDPs) are not always the best choice to model complex dynamical systems requiring …
Analytical solution to a discrete-time model for dynamic learning and decision making
H Zhang - Management Science, 2022 - pubsonline.informs.org
Problems concerning dynamic learning and decision making are difficult to solve
analytically. We study an infinite-horizon discrete-time model with a constant unknown state …
Using Confounded Data in Latent Model-Based Reinforcement Learning
In the presence of confounding, naively using off-the-shelf offline reinforcement learning
(RL) algorithms leads to sub-optimal behaviour. In this work, we propose a safe method to …
Improving quantum state detection with adaptive sequential observations
For many quantum systems intended for information processing, one detects the logical
state of a qubit by integrating a continuously observed quantity over time. For example, ion …
Models as a Key Factor of Environments Design in Multi-Agent Reinforcement Learning
KA Morozov - 2024 6th International Youth Conference on …, 2024 - ieeexplore.ieee.org
This paper presents a generalized graphical outline of the current state of the art in the
development of Markov decision processes (MDPs). This systematization opens ways to …
[BOOK] Efficient Learning of Continuous-Time Hidden Markov Models with Discrete-Time Irregular Observations for Healthcare Intervention Planning
S Ghodsi - 2022 - search.proquest.com
The availability of vast amounts of electronic medical records data has inspired an
increasing interest in data-driven healthcare intervention planning methods. Disease …
Simultaneous online model identification and production optimization using modifier adaptation
A key problem for many industrial processes is to limit exposure to system malfunction. The
system health state can be represented by different models. However, it is often the case that …
Learning HJB Viscosity Solutions with PINNs for Continuous-Time Reinforcement Learning
Despite recent advances in Reinforcement Learning (RL), the Markov Decision Processes
are not always the best choice to model complex dynamical systems requiring interactions at …