- Academic Search

A review of uncertainty for deep reinforcement learning

O Lockwood, M Si - Proceedings of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org

Uncertainty is ubiquitous in games, both in the agents playing games and often in the games
themselves. Working with uncertainty is therefore an important component of successful …

Cited by 66 - All 9 versions

Reinforcement learning applied to wastewater treatment process control optimization: Approaches, challenges, and path forward

HC Croll, K Ikuma, SK Ong, S Sarkar - Critical Reviews in …, 2023 - Taylor & Francis

Wastewater treatment process control optimization is a complex task in a highly nonlinear
environment. Reinforcement learning (RL) is a machine learning technique that stands out …

Cited by 22 - All 5 versions

[PDF] neurips.cc

Mildly conservative Q-learning for offline reinforcement learning

J Lyu, X Ma, X Li, Z Lu - Advances in Neural Information …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) defines the task of learning from a static logged dataset
without continually interacting with the environment. The distribution shift between the …

Cited by 133 - All 5 versions

[PDF] neurips.cc

RORL: Robust offline reinforcement learning via conservative smoothing

R Yang, C Bai, X Ma, Z Wang… - Advances in neural …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) provides a promising direction to exploit massive amount
of offline data for complex decision-making tasks. Due to the distribution shift issue, current …

Cited by 91 - All 9 versions

[PDF] neurips.cc

A policy-guided imitation approach for offline reinforcement learning

H Xu, L Jiang, L Jianxiong… - Advances in neural …, 2022 - proceedings.neurips.cc

Offline reinforcement learning (RL) methods can generally be categorized into two types:
RL-based and Imitation-based. RL-based methods could in principle enjoy out-of-distribution …

Cited by 75 - All 7 versions

[PDF] mlr.press

Model-Bellman inconsistency for model-based offline reinforcement learning

Y Sun, J Zhang, C Jia, H Lin, J Ye… - … Conference on Machine …, 2023 - proceedings.mlr.press

For offline reinforcement learning (RL), model-based methods are expected to be
data-efficient as they incorporate dynamics models to generate more data. However, due to …

Cited by 37 - All 5 versions

[PDF] arxiv.org

Reinforcement learning with human feedback: Learning dynamic choices via pessimism

Z Li, Z Yang, M Wang - arXiv preprint arXiv:2305.18438, 2023 - arxiv.org

In this paper, we study offline Reinforcement Learning with Human Feedback (RLHF) where
we aim to learn the human's underlying reward and the MDP's optimal policy from a set of …

Cited by 55 - All 4 versions

[PDF] neurips.cc

Offline multi-agent reinforcement learning with implicit global-to-local value regularization

X Wang, H Xu, Y Zheng, X Zhan - Advances in Neural …, 2023 - proceedings.neurips.cc

Offline reinforcement learning (RL) has received considerable attention in recent years due
to its attractive capability of learning policies from offline datasets without environmental …

Cited by 25 - All 6 versions

[PDF] mlr.press

What is essential for unseen goal generalization of offline goal-conditioned RL?

R Yang, L Yong, X Ma, H Hu… - … on Machine Learning, 2023 - proceedings.mlr.press

Offline goal-conditioned RL (GCRL) offers a way to train general-purpose agents from fully
offline datasets. In addition to being conservative within the dataset, the generalization …

Cited by 25 - All 7 versions

[PDF] neurips.cc

Corruption-robust offline reinforcement learning with general function approximation

C Ye, R Yang, Q Gu, T Zhang - Advances in Neural …, 2023 - proceedings.neurips.cc

We investigate the problem of corruption robustness in offline reinforcement learning (RL)
with general function approximation, where an adversary can corrupt each sample in the …

Cited by 18 - All 8 versions

Pessimistic bootstrapping for uncertainty-driven offline reinforcement learning

A review of uncertainty for deep reinforcement learning‏