A survey on causal reinforcement learning

Y Zeng, R Cai, F Sun, L Huang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
While reinforcement learning (RL) achieves tremendous success in sequential decision-making
problems of many domains, it still faces key challenges of data inefficiency and the …

MoCoDA: Model-based counterfactual data augmentation

S Pitis, E Creager, A Mandlekar… - Advances in Neural …, 2022 - proceedings.neurips.cc
The number of states in a dynamic process is exponential in the number of objects, making
reinforcement learning (RL) difficult in complex, multi-object domains. For agents to scale to …

Proximal reinforcement learning: Efficient off-policy evaluation in partially observed Markov decision processes

A Bennett, N Kallus - Operations Research, 2024 - pubsonline.informs.org
In applications of offline reinforcement learning to observational data, such as in healthcare
or education, a general concern is that observed actions might be affected by unobserved …

Offline imitation learning with variational counterfactual reasoning

Z Sun, B He, J Liu, X Chen, C Ma… - Advances in Neural …, 2023 - proceedings.neurips.cc
In offline imitation learning (IL), an agent aims to learn an optimal expert behavior policy
without additional online environment interactions. However, in many real-world scenarios …

Causal inference Q-network: Toward resilient reinforcement learning

CHH Yang, I Hung, T Danny, Y Ouyang… - arXiv preprint arXiv …, 2021 - ask.qcloudimg.com
Deep reinforcement learning (DRL) has demonstrated impressive performance in various
gaming simulators and real-world applications. In practice, however, a DRL agent may …

Offline imitation learning with variational counterfactual reasoning

B He, Z Sun, J Liu, S Zhang, X Chen, C Ma - arXiv preprint arXiv …, 2023 - arxiv.org
In offline Imitation Learning (IL), an agent aims to learn an optimal expert behavior policy
without additional online environment interactions. However, in many real-world scenarios …

Training a resilient Q-network against observational interference

CHH Yang, ITD Hung, Y Ouyang… - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Deep reinforcement learning (DRL) has demonstrated impressive performance in various
gaming simulators and real-world applications. In practice, however, a DRL agent may …

Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning

I Hwang, Y Kwak, S Choi, BT Zhang, S Lee - arXiv preprint arXiv …, 2024 - arxiv.org
Causal dynamics learning has recently emerged as a promising approach to enhancing
robustness in reinforcement learning (RL). Typically, the goal is to build a dynamics model …

Learning under adversarial and interventional shifts

H Singh, S Joshi, F Doshi-Velez… - arXiv preprint arXiv …, 2021 - arxiv.org
Machine learning models are often trained on data from one distribution and deployed on
others. So it becomes important to design models that are robust to distribution shifts. Most of …

Towards robust off-policy evaluation via human inputs

H Singh, S Joshi, F Doshi-Velez… - Proceedings of the 2022 …, 2022 - dl.acm.org
Off-policy Evaluation (OPE) methods are crucial tools for evaluating policies in high-stakes
domains such as healthcare, where direct deployment is often infeasible, unethical, or …