[BOOK][B] Bayesian optimization

R Garnett - 2023 - books.google.com
Bayesian optimization is a methodology for optimizing expensive objective functions that
has proven successful in the sciences, engineering, and beyond. This timely text provides a …
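The blurb above describes Bayesian optimization only in general terms. As background, a minimal sketch of the standard loop (a Gaussian-process surrogate plus an upper-confidence-bound acquisition rule on a 1-D grid; all constants and names here are illustrative, not from the book) might look like:

```python
import numpy as np

def rbf(a, b, ls=0.2):
    """Squared-exponential kernel between 1-D point sets a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """GP posterior mean and standard deviation at x_query."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_query, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    var = 1.0 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def bayes_opt(f, n_iter=20, beta=2.0, seed=0):
    """Maximize f on [0, 1] with a GP surrogate and UCB acquisition."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0, 1, 200)
    xs = list(rng.uniform(size=2))  # two random initial evaluations
    ys = [f(x) for x in xs]
    for _ in range(n_iter):
        mean, std = gp_posterior(np.array(xs), np.array(ys), grid)
        x_next = grid[np.argmax(mean + beta * std)]  # UCB rule
        xs.append(x_next)
        ys.append(f(x_next))
    return xs[int(np.argmax(ys))]

best = bayes_opt(lambda x: -(x - 0.3) ** 2)  # toy objective, maximum at x = 0.3
```

The UCB term `beta * std` is what makes the loop explore regions the surrogate is unsure about instead of greedily re-sampling the current best guess.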

POEM: Out-of-distribution detection with posterior sampling

Y Ming, Y Fan, Y Li - International Conference on Machine …, 2022 - proceedings.mlr.press
Out-of-distribution (OOD) detection is indispensable for machine learning models
deployed in the open world. Recently, the use of an auxiliary outlier dataset during training …

Randomized exploration in cooperative multi-agent reinforcement learning

HL Hsu, W Wang, M Pajic, P Xu - Advances in Neural …, 2025 - proceedings.neurips.cc
We present the first study on provably efficient randomized exploration in cooperative multi-
agent reinforcement learning (MARL). We propose a unified algorithm framework for …

Langevin Monte Carlo for contextual bandits

P Xu, H Zheng, EV Mazumdar… - International …, 2022 - proceedings.mlr.press
We study the efficiency of Thompson sampling for contextual bandits. Existing Thompson
sampling-based algorithms need to construct a Laplace approximation (i.e., a Gaussian …
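The truncated abstract contrasts Laplace approximations with Langevin-based posterior sampling. A minimal sketch of Thompson sampling driven by unadjusted Langevin updates on a linear-reward contextual bandit (step sizes, dimensions, and the linear model are illustrative assumptions, not the paper's algorithm) could be:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_arms, T = 3, 5, 200
theta_true = rng.normal(size=d)  # hypothetical true reward parameter

def log_post_grad(theta, X, y, prior_var=10.0):
    """Gradient of the Gaussian log-posterior for linear rewards."""
    if len(y) == 0:
        return -theta / prior_var
    return X.T @ (y - X @ theta) - theta / prior_var

def langevin_sample(theta, X, y, step=1e-3, n_steps=50):
    """Approximate posterior sample via unadjusted Langevin dynamics."""
    for _ in range(n_steps):
        noise = rng.normal(size=theta.shape)
        theta = theta + step * log_post_grad(theta, X, y) + np.sqrt(2 * step) * noise
    return theta

X_hist, y_hist = [], []
theta_hat = np.zeros(d)
for t in range(T):
    contexts = rng.normal(size=(n_arms, d))
    # Thompson step: act greedily w.r.t. one (approximate) posterior sample.
    theta_hat = langevin_sample(
        theta_hat, np.array(X_hist).reshape(-1, d), np.array(y_hist)
    )
    arm = int(np.argmax(contexts @ theta_hat))
    reward = contexts[arm] @ theta_true + 0.1 * rng.normal()
    X_hist.append(contexts[arm])
    y_hist.append(reward)
```

Warm-starting each round's chain from the previous sample keeps the number of gradient steps per round small, which is the practical appeal over re-fitting a Laplace approximation.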

[PDF] Use your instinct: Instruction optimization using neural bandits coupled with transformers

X Lin, Z Wu, Z Dai, W Hu, Y Shu, SK Ng, P Jaillet… - arXiv preprint arXiv …, 2023 - mit.edu
Large language models (LLMs) have shown remarkable instruction-following capabilities
and achieved impressive performances in various applications. However, the performances …

Contextual bandits with large action spaces: Made practical

Y Zhu, DJ Foster, J Langford… - … Conference on Machine …, 2022 - proceedings.mlr.press
A central problem in sequential decision making is to develop algorithms that are practical
and computationally efficient, yet support the use of flexible, general-purpose models …

Approximate Thompson sampling via epistemic neural networks

I Osband, Z Wen, SM Asghari… - Uncertainty in …, 2023 - proceedings.mlr.press
Thompson sampling (TS) is a popular heuristic for action selection, but it requires sampling
from a posterior distribution. Unfortunately, this can become computationally intractable in …
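The abstract is cut off before describing epistemic neural networks themselves. As background on the problem it names, one common way to approximate Thompson sampling without an explicit posterior is a bootstrapped ensemble, sampling one member per decision (a generic sketch under that assumption, not the paper's epinet architecture):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_arms, K, T = 4, 6, 10, 300
theta_true = rng.normal(size=d)  # hypothetical true reward parameter

# Ensemble of K linear reward models with random initializations; the
# spread across members stands in for posterior (epistemic) uncertainty.
ensemble = rng.normal(scale=0.5, size=(K, d))

for t in range(T):
    contexts = rng.normal(size=(n_arms, d))
    k = rng.integers(K)  # Thompson step: sample one ensemble member
    arm = int(np.argmax(contexts @ ensemble[k]))
    reward = contexts[arm] @ theta_true + 0.1 * rng.normal()
    # Bootstrap mask: each member sees the observation with probability 0.5,
    # so members stay decorrelated and the ensemble keeps its spread.
    for j in range(K):
        if rng.random() < 0.5:
            pred = contexts[arm] @ ensemble[j]
            ensemble[j] += 0.05 * (reward - pred) * contexts[arm]  # SGD step
```

Acting on a single sampled member per round is what recovers Thompson-sampling-style exploration without ever forming a posterior distribution.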

Neural contextual bandits with deep representation and shallow exploration

P Xu, Z Wen, H Zhao, Q Gu - arXiv preprint arXiv:2012.01780, 2020 - arxiv.org
We study a general class of contextual bandits, where each context-action pair is associated
with a raw feature vector, but the reward generating function is unknown. We propose a …
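The abstract stops mid-sentence, but the title's "deep representation and shallow exploration" idea is commonly realized as linear exploration on top of a learned feature map. A toy sketch with frozen random ReLU features standing in for the learned network, and LinUCB as the shallow exploration layer (all names and constants hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
d_raw, d_feat, n_arms, T = 5, 16, 4, 150
W = rng.normal(size=(d_feat, d_raw)) / np.sqrt(d_raw)  # frozen "deep" layer

def features(x):
    """Stand-in for a learned representation: random ReLU features."""
    return np.maximum(W @ x, 0.0)

theta_true = rng.normal(size=d_feat)  # hypothetical reward head
A = np.eye(d_feat)  # ridge Gram matrix for the linear head
b = np.zeros(d_feat)
alpha = 1.0         # exploration width

for t in range(T):
    contexts = rng.normal(size=(n_arms, d_raw))
    phi = np.array([features(x) for x in contexts])
    A_inv = np.linalg.inv(A)
    theta = A_inv @ b
    # LinUCB: exploit the linear head, explore only in the last layer.
    ucb = phi @ theta + alpha * np.sqrt(np.einsum("ij,jk,ik->i", phi, A_inv, phi))
    arm = int(np.argmax(ucb))
    reward = phi[arm] @ theta_true + 0.1 * rng.normal()
    A += np.outer(phi[arm], phi[arm])
    b += reward * phi[arm]
```

Confining the confidence bonus to the last linear layer keeps the exploration machinery cheap even when the representation underneath is a large network.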

Optimal order simple regret for Gaussian process bandits

S Vakili, N Bouziani, S Jalali… - Advances in Neural …, 2021 - proceedings.neurips.cc
Consider the sequential optimization of a continuous, possibly non-convex, and expensive
to evaluate objective function $ f $. The problem can be cast as a Gaussian Process (GP) …

Quantum Bayesian optimization

Z Dai, GKR Lau, A Verma, Y Shu… - Advances in Neural …, 2023 - proceedings.neurips.cc
Kernelized bandits, also known as Bayesian optimization (BO), has been a prevalent
method for optimizing complicated black-box reward functions. Various BO algorithms have …