Google Scholar

A survey on causal reinforcement learning

Y Zeng, R Cai, F Sun, L Huang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

While reinforcement learning (RL) achieves tremendous success in sequential
decision-making problems of many domains, it still faces key challenges of data inefficiency and the …

Cited by 24, all 2 versions

[PDF] neurips.cc

Mocoda: Model-based counterfactual data augmentation

S Pitis, E Creager, A Mandlekar… - Advances in Neural …, 2022 - proceedings.neurips.cc

The number of states in a dynamic process is exponential in the number of objects, making
reinforcement learning (RL) difficult in complex, multi-object domains. For agents to scale to …

Cited by 42, all 6 versions

[PDF] arxiv.org

Proximal reinforcement learning: Efficient off-policy evaluation in partially observed markov decision processes

A Bennett, N Kallus - Operations Research, 2024 - pubsonline.informs.org

In applications of offline reinforcement learning to observational data, such as in healthcare
or education, a general concern is that observed actions might be affected by unobserved …

Cited by 46, all 7 versions

[PDF] neurips.cc

Offline imitation learning with variational counterfactual reasoning

Z Sun, B He, J Liu, X Chen, C Ma… - Advances in Neural …, 2023 - proceedings.neurips.cc

In offline imitation learning (IL), an agent aims to learn an optimal expert behavior policy
without additional online environment interactions. However, in many real-world scenarios …

Cited by 6, all 3 versions

[PDF] qcloudimg.com

[PDF] Causal inference q-network: Toward resilient reinforcement learning

CHH Yang, I Hung, T Danny, Y Ouyang… - arxiv preprint arxiv …, 2021 - ask.qcloudimg.com

Deep reinforcement learning (DRL) has demonstrated impressive performance in various
gaming simulators and real-world applications. In practice, however, a DRL agent may …

Cited by 26, all 4 versions

[PDF] arxiv.org

Offline imitation learning with variational counterfactual reasoning

B He, Z Sun, J Liu, S Zhang, X Chen, C Ma - arxiv preprint arxiv …, 2023 - arxiv.org

In offline Imitation Learning (IL), an agent aims to learn an optimal expert behavior policy
without additional online environment interactions. However, in many real-world scenarios …

Cited by 2, all 2 versions

[PDF] aaai.org

Training a resilient q-network against observational interference

CHH Yang, ITD Hung, Y Ouyang… - Proceedings of the AAAI …, 2022 - ojs.aaai.org

Deep reinforcement learning (DRL) has demonstrated impressive performance in various
gaming simulators and real-world applications. In practice, however, a DRL agent may …

Cited by 17, all 3 versions

[PDF] arxiv.org

Fine-Grained Causal Dynamics Learning with Quantization for Improving Robustness in Reinforcement Learning

I Hwang, Y Kwak, S Choi, BT Zhang, S Lee - arxiv preprint arxiv …, 2024 - arxiv.org

Causal dynamics learning has recently emerged as a promising approach to enhancing
robustness in reinforcement learning (RL). Typically, the goal is to build a dynamics model …

All 3 versions

[PDF] arxiv.org

Learning under adversarial and interventional shifts

H Singh, S Joshi, F Doshi-Velez… - arxiv preprint arxiv …, 2021 - arxiv.org

Machine learning models are often trained on data from one distribution and deployed on
others. So it becomes important to design models that are robust to distribution shifts. Most of …

Cited by 8, all 6 versions

[PDF] acm.org

Towards robust off-policy evaluation via human inputs

H Singh, S Joshi, F Doshi-Velez… - Proceedings of the 2022 …, 2022 - dl.acm.org

Off-policy Evaluation (OPE) methods are crucial tools for evaluating policies in high-stakes
domains such as healthcare, where direct deployment is often infeasible, unethical, or …

Cited by 2, all 5 versions

Saved to My Library

Counterfactually guided policy transfer in clinical settings

A survey on causal reinforcement learning
