Hierarchical Bayesian bandits

J Hong, B Kveton, M Zaheer… - International …, 2022 - proceedings.mlr.press
Meta-, multi-task, and federated learning can all be viewed as solving similar tasks,
drawn from a distribution that reflects task similarities. We provide a unified view of all these …

Adaptivity and confounding in multi-armed bandit experiments

C Qin, D Russo - arXiv preprint arXiv:2202.09036, 2022 - aeaweb.org
We explore a new model of bandit experiments where a potentially nonstationary sequence
of contexts influences arms' performance. Context-unaware algorithms risk confounding …

Mixed-effect Thompson sampling

I Aouali, B Kveton, S Katariya - International Conference on …, 2023 - proceedings.mlr.press
A contextual bandit is a popular framework for online learning to act under uncertainty. In
practice, the number of actions is huge and their expected rewards are correlated. In this …

Multi-task off-policy learning from bandit feedback

J Hong, B Kveton, M Zaheer… - International …, 2023 - proceedings.mlr.press
Many practical problems involve solving similar tasks. In recommender systems, the tasks
can be users with similar preferences; in search engines, the tasks can be items with similar …

Transportability for bandits with data from different environments

A Bellot, A Malek, S Chiappa - Advances in Neural …, 2023 - proceedings.neurips.cc
A unifying theme in the design of intelligent agents is to efficiently optimize a policy based on
what prior knowledge of the problem is available and what actions can be taken to learn …

Lifelong bandit optimization: no prior and no regret

F Schur, P Kassraie, J Rothfuss… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press
Machine learning algorithms are often repeatedly applied to problems with similar
structure over and over again. We focus on solving a sequence of bandit optimization tasks …

Meta Learning in Bandits within Shared Affine Subspaces

S Bilaj, S Dhouib, S Maghsudi - International Conference on …, 2024 - proceedings.mlr.press
We study the problem of meta-learning several contextual stochastic bandits tasks by
leveraging their concentration around a low dimensional affine subspace, which we learn …

Thompson sampling with diffusion generative prior

YG Hsieh, SP Kasiviswanathan, B Kveton… - arXiv preprint arXiv …, 2023 - arxiv.org
In this work, we initiate the idea of using denoising diffusion models to learn priors for online
decision making problems. Our special focus is on the meta-learning for bandit framework …

Thompson sampling for robust transfer in multi-task bandits

Z Wang, C Zhang, K Chaudhuri - arXiv preprint arXiv:2206.08556, 2022 - arxiv.org
We study the problem of online multi-task learning where the tasks are performed within
similar but not necessarily identical multi-armed bandit environments. In particular, we study …

Prior-dependent allocations for Bayesian fixed-budget best-arm identification in structured bandits

N Nguyen, I Aouali, A György, C Vernade - arXiv preprint arXiv …, 2024 - arxiv.org
We study the problem of Bayesian fixed-budget best-arm identification (BAI) in structured
bandits. We propose an algorithm that uses fixed allocations based on the prior information …