A survey on causal reinforcement learning

Y Zeng, R Cai, F Sun, L Huang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
While reinforcement learning (RL) achieves tremendous success in sequential
decision-making problems of many domains, it still faces key challenges of data inefficiency and the …

Causal reinforcement learning: A survey

Z Deng, J Jiang, G Long, C Zhang - arXiv preprint arXiv:2307.01452, 2023 - arxiv.org
Reinforcement learning is an essential paradigm for solving sequential decision problems
under uncertainty. Despite many remarkable achievements in recent decades, applying …

Causal confusion in imitation learning

P De Haan, D Jayaraman… - Advances in neural …, 2019 - proceedings.neurips.cc
Behavioral cloning reduces policy learning to supervised learning by training a
discriminative model to predict expert actions given observations. Such discriminative …

Structural causal bandits: Where to intervene?

S Lee, E Bareinboim - Advances in neural information …, 2018 - proceedings.neurips.cc
We study the problem of identifying the best action in a sequential decision-making setting
when the reward distributions of the arms exhibit a non-trivial dependence structure, which …

Causal bandits with unknown graph structure

Y Lu, A Meisami, A Tewari - Advances in Neural …, 2021 - proceedings.neurips.cc
In causal bandit problems the action set consists of interventions on variables of a causal
graph. Several researchers have recently studied such bandit problems and pointed out …

Regret analysis of bandit problems with causal background knowledge

Y Lu, A Meisami, A Tewari… - … on Uncertainty in Artificial …, 2020 - proceedings.mlr.press
We study how to learn optimal interventions sequentially given causal information
represented as a causal graph along with associated conditional distributions. Causal …

Active learning for optimal intervention design in causal models

J Zhang, L Cammarata, C Squires, TP Sapsis… - Nature Machine …, 2023 - nature.com
Sequential experimental design to discover interventions that achieve a desired outcome is
a key problem in various domains including science, engineering and public policy. When …

Provably efficient causal reinforcement learning with confounded observational data

L Wang, Z Yang, Z Wang - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Empowered by neural networks, deep reinforcement learning (DRL) achieves tremendous
empirical success. However, DRL requires a large dataset by interacting with the …

Budgeted and non-budgeted causal bandits

V Nair, V Patil, G Sinha - International Conference on …, 2021 - proceedings.mlr.press
Learning good interventions in a causal graph can be modelled as a stochastic multi-armed
bandit problem with side-information. First, we study this problem when interventions are …

Causal bandits for linear structural equation models

B Varici, K Shanmugam, P Sattigeri, A Tajer - Journal of Machine Learning …, 2023 - jmlr.org
This paper studies the problem of designing an optimal sequence of interventions in a
causal graphical model to minimize cumulative regret with respect to the best intervention in …