Google Scholar

[BOOK] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3281 · All 9 versions

[PDF] nowpublishers.com

Introduction to multi-armed bandits

A Slivkins - Foundations and Trends® in Machine Learning, 2019 - nowpublishers.com

Multi-armed bandits are a simple but very powerful framework for algorithms that make
decisions over time under uncertainty. An enormous body of work has accumulated over the …

[PDF] arxiv.org

Reinforcement learning to rank in e-commerce search engine: Formalization, analysis, and application

Y Hu, Q Da, A Zeng, Y Yu, Y Xu - Proceedings of the 24th ACM SIGKDD …, 2018 - dl.acm.org

In E-commerce platforms such as Amazon and TaoBao, ranking items in a search session is
a typical multi-step decision-making problem. Learning to rank (LTR) methods have been …

Cited by 217 · All 7 versions

[PDF] mlr.press

Thompson sampling for combinatorial semi-bandits

S Wang, W Chen - International Conference on Machine …, 2018 - proceedings.mlr.press

We study the application of the Thompson sampling (TS) methodology to the stochastic
combinatorial multi-armed bandit (CMAB) framework. We analyze the standard TS algorithm …

Cited by 158 · All 5 versions

[PDF] neurips.cc

Combinatorial multi-armed bandit with general reward functions

W Chen, W Hu, F Li, J Li, Y Liu… - Advances in Neural …, 2016 - proceedings.neurips.cc

In this paper, we study the stochastic combinatorial multi-armed bandit (CMAB) framework
that allows a general nonlinear reward function, whose expected value may not depend only …

[PDF] mlr.press

Contextual combinatorial cascading bandits

S Li, B Wang, S Zhang, W Chen - … conference on machine …, 2016 - proceedings.mlr.press

We propose the contextual combinatorial cascading bandits, a combinatorial online learning
game, where at each time step a learning agent is given a set of contextual information, then …

[PDF] neurips.cc

Online influence maximization under independent cascade model with semi-bandit feedback

Z Wen, B Kveton, M Valko… - Advances in neural …, 2017 - proceedings.neurips.cc

We study the online influence maximization problem in social networks under the
independent cascade model. Specifically, we aim to learn the set of "best influencers" in a …

[PDF] arxiv.org

Cascading bandits for large-scale recommendation problems

S Zong, H Ni, K Sung, NR Ke, Z Wen… - arXiv preprint arXiv …, 2016 - arxiv.org

Most recommender systems recommend a list of items. The user examines the list, from the
first item to the last, and often chooses the first attractive item and does not examine the rest …

[PDF] mlr.press

Online learning to rank in stochastic click models

M Zoghi, T Tunys, M Ghavamzadeh… - International …, 2017 - proceedings.mlr.press

Online learning to rank is a core problem in information retrieval and machine learning.
Many provably efficient algorithms have been recently proposed for this problem in specific …

[PDF] mlr.press

Contextual combinatorial bandits with probabilistically triggered arms

X Liu, J Zuo, S Wang, JCS Lui… - International …, 2023 - proceedings.mlr.press

We study contextual combinatorial bandits with probabilistically triggered arms (C$^2$MAB-T)
under a variety of smoothness conditions that capture a wide range of applications …

Cited by 18 · All 8 versions

Combinatorial cascading bandits

[BOOK][B] Bandit algorithms
