Google Scholar

[Book] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3356 · All versions (9)

[PDF] mlr.press

Linear Thompson sampling revisited

M Abeille, A Lazaric - Artificial Intelligence and Statistics, 2017 - proceedings.mlr.press

We derive an alternative proof for the regret of Thompson sampling (TS) in the stochastic
linear bandit setting. While we obtain a regret bound of order $O(d^{3/2}\sqrt{T})$ as in …

Cited by 305 · All versions (18)

[PDF] mlr.press

Tight regret bounds for stochastic combinatorial semi-bandits

B Kveton, Z Wen, A Ashkan… - Artificial Intelligence …, 2015 - proceedings.mlr.press

A stochastic combinatorial semi-bandit is an online learning problem where at each step a
learning agent chooses a subset of ground items subject to constraints, and then observes …

Cited by 356 · All versions (13)

[PDF] mlr.press

Cascading bandits: Learning to rank in the cascade model

B Kveton, C Szepesvari, Z Wen… - … conference on machine …, 2015 - proceedings.mlr.press

A search engine usually outputs a list of K web pages. The user examines this list, from the
first web page to the last, and chooses the first attractive page. This model of user behavior …

Cited by 336 · All versions (13)

[PDF] sciencedirect.com

Bandit algorithms: A comprehensive review and their dynamic selection from a portfolio for multicriteria top-k recommendation

A Letard, N Gutowski, O Camp, T Amghar - Expert Systems with …, 2024 - Elsevier

This paper discusses the use of portfolio approaches based on bandit algorithms to optimize
multicriteria decision-making in recommender systems (accuracy and diversity). While …

Cited by 5 · All versions (5)

[PDF] mlr.press

Thompson sampling for combinatorial semi-bandits

S Wang, W Chen - International Conference on Machine …, 2018 - proceedings.mlr.press

We study the application of the Thompson sampling (TS) methodology to the stochastic
combinatorial multi-armed bandit (CMAB) framework. We analyze the standard TS algorithm …

Cited by 160 · All versions (5)

[PDF] neurips.cc

Combinatorial bandits revisited

R Combes… - Advances in neural …, 2015 - proceedings.neurips.cc

This paper investigates stochastic and adversarial combinatorial multi-armed bandit
problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific …

Cited by 293 · All versions (21)

[PDF] neurips.cc

Minimal exploration in structured stochastic bandits

R Combes, S Magureanu… - Advances in Neural …, 2017 - proceedings.neurips.cc

This paper introduces and addresses a wide class of stochastic bandit problems where the
function mapping the arm to the corresponding reward exhibits some known structural …

Cited by 141 · All versions (17)

[PDF] neurips.cc

Online influence maximization under independent cascade model with semi-bandit feedback

Z Wen, B Kveton, M Valko… - Advances in neural …, 2017 - proceedings.neurips.cc

We study the online influence maximization problem in social networks under the
independent cascade model. Specifically, we aim to learn the set of "best influencers" in a …

Cited by 152 · All versions (21)

[PDF] arxiv.org

Cascading bandits for large-scale recommendation problems

S Zong, H Ni, K Sung, NR Ke, Z Wen… - arXiv preprint arXiv …, 2016 - arxiv.org

Most recommender systems recommend a list of items. The user examines the list, from the
first item to the last, and often chooses the first attractive item and does not examine the rest …

Cited by 140 · All versions (11)

Efficient learning in large-scale combinatorial semi-bandits