A survey on causal reinforcement learning

Y Zeng, R Cai, F Sun, L Huang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

While reinforcement learning (RL) achieves tremendous success in sequential decision-making
problems of many domains, it still faces key challenges of data inefficiency and the …

Cited by 23; 2 versions

[PDF] arxiv.org

Causal reinforcement learning: A survey

Z Deng, J Jiang, G Long, C Zhang - arXiv preprint arXiv:2307.01452, 2023 - arxiv.org

Reinforcement learning is an essential paradigm for solving sequential decision problems
under uncertainty. Despite many remarkable achievements in recent decades, applying …

Cited by 15; 5 versions

[PDF] neurips.cc

Causal confusion in imitation learning

P De Haan, D Jayaraman… - Advances in neural …, 2019 - proceedings.neurips.cc

Behavioral cloning reduces policy learning to supervised learning by training a
discriminative model to predict expert actions given observations. Such discriminative …

Cited by 372; 12 versions

[PDF] neurips.cc

Structural causal bandits: Where to intervene?

S Lee, E Bareinboim - Advances in neural information …, 2018 - proceedings.neurips.cc

We study the problem of identifying the best action in a sequential decision-making setting
when the reward distributions of the arms exhibit a non-trivial dependence structure, which …

Cited by 118; 7 versions

[PDF] neurips.cc

Causal bandits with unknown graph structure

Y Lu, A Meisami, A Tewari - Advances in Neural …, 2021 - proceedings.neurips.cc

In causal bandit problems the action set consists of interventions on variables of a causal
graph. Several researchers have recently studied such bandit problems and pointed out …

Cited by 49; 10 versions

[PDF] mlr.press

Regret analysis of bandit problems with causal background knowledge

Y Lu, A Meisami, A Tewari… - … on Uncertainty in Artificial …, 2020 - proceedings.mlr.press

We study how to learn optimal interventions sequentially given causal information
represented as a causal graph along with associated conditional distributions. Causal …

Cited by 76; 10 versions

[PDF] arxiv.org

Active learning for optimal intervention design in causal models

J Zhang, L Cammarata, C Squires, TP Sapsis… - Nature Machine …, 2023 - nature.com

Sequential experimental design to discover interventions that achieve a desired outcome is
a key problem in various domains including science, engineering and public policy. When …

Cited by 30; 5 versions

[PDF] neurips.cc

Provably efficient causal reinforcement learning with confounded observational data

L Wang, Z Yang, Z Wang - Advances in Neural Information …, 2021 - proceedings.neurips.cc

Empowered by neural networks, deep reinforcement learning (DRL) achieves tremendous
empirical success. However, DRL requires a large dataset by interacting with the …

Cited by 67; 6 versions

[PDF] mlr.press

Budgeted and non-budgeted causal bandits

V Nair, V Patil, G Sinha - International Conference on …, 2021 - proceedings.mlr.press

Learning good interventions in a causal graph can be modelled as a stochastic multi-armed
bandit problem with side-information. First, we study this problem when interventions are …

Cited by 47; 4 versions

[PDF] jmlr.org

Causal bandits for linear structural equation models

B Varici, K Shanmugam, P Sattigeri, A Tajer - Journal of Machine Learning …, 2023 - jmlr.org

This paper studies the problem of designing an optimal sequence of interventions in a
causal graphical model to minimize cumulative regret with respect to the best intervention in …

Cited by 14; 6 versions
