Google Scholar

Motion planning for autonomous driving: The state of the art and future perspectives

S Teng, X Hu, P Deng, B Li, Y Li, Y Ai… - IEEE Transactions …, 2023 - ieeexplore.ieee.org

Intelligent vehicles (IVs) have gained worldwide attention due to their increased
convenience, safety advantages, and potential commercial value. Despite predictions of …

Cited by 404

Knowledge-integrated machine learning for materials: lessons from gameplaying and robotics

K Hippalgaonkar, Q Li, X Wang, JW Fisher III… - Nature Reviews …, 2023 - nature.com

As materials researchers increasingly embrace machine-learning (ML) methods, it is natural
to wonder what lessons can be learned from other fields undergoing similar developments …

Cited by 90

Open problems and fundamental limitations of reinforcement learning from human feedback

S Casper, X Davies, C Shi, TK Gilbert… - arXiv preprint arXiv …, 2023 - arxiv.org

Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …

Cited by 439

Principled reinforcement learning with human feedback from pairwise or k-wise comparisons

B Zhu, M Jordan, J Jiao - International Conference on …, 2023 - proceedings.mlr.press

We provide a theoretical framework for Reinforcement Learning with Human Feedback
(RLHF). We show that when the underlying true reward is linear, under both Bradley-Terry …

Cited by 177

Video pretraining (VPT): Learning to act by watching unlabeled online videos

B Baker, I Akkaya, P Zhokov… - Advances in …, 2022 - proceedings.neurips.cc

Pretraining on noisy, internet-scale datasets has been heavily studied as a technique for
training models with broad, general capabilities for text, images, and other modalities …

Cited by 288

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Cited by 4711

A survey of zero-shot generalisation in deep reinforcement learning

R Kirk, A Zhang, E Grefenstette, T Rocktäschel - Journal of Artificial …, 2023 - jair.org

The study of zero-shot generalisation (ZSG) in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations at …

Cited by 410

Defining and characterizing reward gaming

J Skalse, N Howe… - Advances in Neural …, 2022 - proceedings.neurips.cc

We provide the first formal definition of reward hacking, a phenomenon where
optimizing an imperfect proxy reward function, $\mathcal{\tilde{R}}$, leads to poor …

Cited by 241

Deep reinforcement learning in smart manufacturing: A review and prospects

C Li, P Zheng, Y Yin, B Wang, L Wang - CIRP Journal of Manufacturing …, 2023 - Elsevier

To facilitate the personalized smart manufacturing paradigm with cognitive automation
capabilities, Deep Reinforcement Learning (DRL) has attracted ever-increasing attention by …

Cited by 194

Reinforcement learning algorithms: A brief survey

AK Shakya, G Pillai, S Chakrabarty - Expert Systems with Applications, 2023 - Elsevier

Reinforcement Learning (RL) is a machine learning (ML) technique to learn sequential
decision-making in complex problems. RL is inspired by trial-and-error based human/animal …

Cited by 210

Algorithms for inverse reinforcement learning.

Motion planning for autonomous driving: The state of the art and future perspectives‏
