Reinforcement learning in healthcare: A survey

C Yu, J Liu, S Nemati, G Yin - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
As a subfield of machine learning, reinforcement learning (RL) aims at optimizing decision
making by using interaction samples of an agent with its environment and the potentially …

Reinforcement learning for intelligent healthcare applications: A survey

A Coronato, M Naeem, G De Pietro… - Artificial intelligence in …, 2020 - Elsevier
Discovering new treatments and personalizing existing ones is one of the major goals of
modern clinical research. In the last decade, Artificial Intelligence (AI) has enabled the …

Towards optimal off-policy evaluation for reinforcement learning with marginalized importance sampling

T Xie, Y Ma, YX Wang - Advances in neural information …, 2019 - proceedings.neurips.cc
Motivated by the many real-world applications of reinforcement learning (RL) that require
safe-policy iterations, we consider the problem of off-policy evaluation (OPE), the problem …

[BOOK] Statistical methods for dynamic treatment regimes

B Chakraborty, EEM Moodie - 2013 - Springer
This book was written to summarize and describe the state of the art of statistical methods
developed to address questions of estimation and inference for dynamic treatment regimes …

[BOOK] Reinforcement learning and dynamic programming using function approximators

L Busoniu, R Babuska, B De Schutter, D Ernst - 2017 - taylorfrancis.com
From household appliances to applications in robotics, engineered systems involving
complex dynamics can only be as effective as the algorithms that control them. While …

A reinforcement learning approach to weaning of mechanical ventilation in intensive care units

N Prasad, LF Cheng, C Chivers, M Draugelis… - arXiv preprint arXiv …, 2017 - arxiv.org
The management of invasive mechanical ventilation, and the regulation of sedation and
analgesia during ventilation, constitutes a major part of the care of patients admitted to …

Off-policy policy gradient with state distribution correction

Y Liu, A Swaminathan, A Agarwal… - arXiv preprint arXiv …, 2019 - arxiv.org
We study the problem of off-policy policy optimization in Markov decision processes, and
develop a novel off-policy policy gradient method. Prior off-policy policy gradient …

Experience replay for real-time reinforcement learning control

S Adam, L Busoniu, R Babuska - IEEE Transactions on …, 2011 - ieeexplore.ieee.org
Reinforcement-learning (RL) algorithms can automatically learn optimal control strategies
for nonlinear, possibly stochastic systems. A promising approach for RL control is …

Reinforcement learning design for cancer clinical trials

Y Zhao, MR Kosorok, D Zeng - Statistics in medicine, 2009 - Wiley Online Library
We develop reinforcement learning trials for discovering individualized treatment regimens
for life‐threatening diseases such as cancer. A temporal‐difference learning method called …

Leveraging factored action spaces for efficient offline reinforcement learning in healthcare

S Tang, M Makar, M Sjoding… - Advances in neural …, 2022 - proceedings.neurips.cc
Many reinforcement learning (RL) applications have combinatorial action spaces, where
each action is a composition of sub-actions. A standard RL approach ignores this inherent …