Causal reinforcement learning using observational and interventional data

M Gasse, D Grasset, G Gaudron… - arXiv preprint arXiv …, 2021 - arxiv.org
Learning efficiently a causal model of the environment is a key challenge of model-based
RL agents operating in POMDPs. We consider here a scenario where the learning agent …

Reinforcement learning with non-exponential discounting

M Schultheis, CA Rothkopf… - Advances in neural …, 2022 - proceedings.neurips.cc
Commonly in reinforcement learning (RL), rewards are discounted over time using an
exponential function to model time preference, thereby bounding the expected long-term …

Revisiting Continuous-Time Reinforcement Learning. A Study of HJB Solvers Based on PINNs and FEMs

A Shilova, T Delliaux, P Preux… - … European Workshop on …, 2023 - openreview.net
Despite recent advances in Reinforcement Learning (RL), the Markov Decision Processes
(MDPs) are not always the best choice to model complex dynamical systems requiring …

Analytical solution to a discrete-time model for dynamic learning and decision making

H Zhang - Management Science, 2022 - pubsonline.informs.org
Problems concerning dynamic learning and decision making are difficult to solve
analytically. We study an infinite-horizon discrete-time model with a constant unknown state …

Using Confounded Data in Latent Model-Based Reinforcement Learning

M Gasse, D Grasset, G Gaudron… - … on Machine Learning …, 2023 - inria.hal.science
In the presence of confounding, naively using off-the-shelf offline reinforcement learning
(RL) algorithms leads to sub-optimal behaviour. In this work, we propose a safe method to …

Improving quantum state detection with adaptive sequential observations

S Geller, DC Cole, S Glancy… - Quantum Science and …, 2022 - iopscience.iop.org
For many quantum systems intended for information processing, one detects the logical
state of a qubit by integrating a continuously observed quantity over time. For example, ion …

Models as a Key Factor of Environments Design in Multi-Agent Reinforcement Learning

KA Morozov - 2024 6th International Youth Conference on …, 2024 - ieeexplore.ieee.org
This paper presents a generalized graphical outline of the current state of the art in the
development of Markov decision processes (MDPs). This systematization opens ways to …

[BOOK] Efficient Learning of Continuous-Time Hidden Markov Models with Discrete-Time Irregular Observations for Healthcare Intervention Planning

S Ghodsi - 2022 - search.proquest.com
The availability of vast amounts of electronic medical records data has inspired an
increasing interest in data-driven healthcare intervention planning methods. Disease …

Simultaneous online model identification and production optimization using modifier adaptation

J Matias, V Kungurtsev, M Egan - Journal of Process Control, 2022 - Elsevier
A key problem for many industrial processes is to limit exposure to system malfunction. The
system health state can be represented by different models. However, it is often the case that …

Learning HJB Viscosity Solutions with PINNs for Continuous-Time Reinforcement Learning

A Shilova, T Delliaux, P Preux, B Raffin - 2024 - inria.hal.science
Despite recent advances in Reinforcement Learning (RL), the Markov Decision Processes
are not always the best choice to model complex dynamical systems requiring interactions at …