Google Scholar

Neural approaches to conversational AI

J Gao, M Galley, L Li - The 41st international ACM SIGIR conference on …, 2018 - dl.acm.org

This tutorial surveys neural approaches to conversational AI that were developed in the last
few years. We group conversational systems into three categories: (1) question answering …

Cited by 928 - All 16 versions

[PDF] neurips.cc

Bellman-consistent pessimism for offline reinforcement learning‏

T Xie, CA Cheng, N Jiang, P Mineiro… - Advances in neural …, 2021 - proceedings.neurips.cc

The use of pessimism, when reasoning about datasets lacking exhaustive exploration, has
recently gained prominence in offline reinforcement learning. Despite the robustness it adds …‏

Cited by 306 - All 14 versions

[PDF] neurips.cc

Bridging offline reinforcement learning and imitation learning: A tale of pessimism‏

P Rashidinejad, B Zhu, C Ma, J Jiao… - Advances in Neural …, 2021‏ - proceedings.neurips.cc‏

Offline (or batch) reinforcement learning (RL) algorithms seek to learn an optimal policy from
a fixed dataset without active data collection. Based on the composition of the offline dataset …‏

Cited by 317 - All 8 versions

[PDF] arxiv.org

The statistical complexity of interactive decision making‏

DJ Foster, SM Kakade, J Qian, A Rakhlin - arXiv preprint arXiv:2112.13487, 2021 - arxiv.org

A fundamental challenge in interactive learning and decision making, ranging from bandit
problems to reinforcement learning, is to provide sample-efficient, adaptive learning …‏

Cited by 207 - All 6 versions

[PDF] arxiv.org

Dive into deep learning‏

A Zhang, ZC Lipton, M Li, AJ Smola - arXiv preprint arXiv:2106.11342, 2021 - arxiv.org

This open-source book represents our attempt to make deep learning approachable,
teaching readers the concepts, the context, and the code. The entire book is drafted in …‏

Cited by 1225 - All 9 versions

[PDF] mlr.press

Bilinear classes: A structural framework for provable generalization in RL

S Du, S Kakade, J Lee, S Lovett… - International …, 2021‏ - proceedings.mlr.press‏

This work introduces Bilinear Classes, a new structural framework, which permit
generalization in reinforcement learning in a wide variety of settings through the use of …‏

Cited by 245 - All 8 versions

[PDF] neurips.cc

Bellman eluder dimension: New rich classes of RL problems, and sample-efficient algorithms

C Jin, Q Liu, S Miryoosefi - Advances in neural information …, 2021 - proceedings.neurips.cc

Finding the minimal structural assumptions that empower sample-efficient learning is one of
the most important research directions in Reinforcement Learning (RL). This paper …‏

Cited by 266 - All 11 versions

[PDF] mlr.press

Provably efficient reinforcement learning with linear function approximation‏

C Jin, Z Yang, Z Wang… - Conference on learning …, 2020 - proceedings.mlr.press

Modern Reinforcement Learning (RL) is commonly applied to practical problems
with an enormous number of states, where function approximation must be deployed …

Cited by 776 - All 4 versions

[PDF] mlr.press

Nearly minimax optimal reinforcement learning for linear mixture Markov decision processes

D Zhou, Q Gu, C Szepesvari - Conference on Learning …, 2021‏ - proceedings.mlr.press‏

We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …‏

Cited by 246 - All 7 versions

[PDF] mlr.press

When is partially observable reinforcement learning not scary?‏

Q Liu, A Chung, C Szepesvári… - Conference on Learning …, 2022‏ - proceedings.mlr.press‏

Partial observability is ubiquitous in applications of Reinforcement Learning (RL), in which
agents learn to make a sequence of decisions despite lacking complete information about …‏

Cited by 112 - All 7 versions

Contextual decision processes with low Bellman rank are PAC-learnable

Neural approaches to conversational AI‏
