- Academic Search

AM Metelli, F Trovo, M Pirola… - … Conference on Machine …, 2022 - proceedings.mlr.press

This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e., those sequential
selection techniques able to learn online using only the feedback given by the chosen …

Cited by: 27

[PDF] arxiv.org

Which LLM to Play? Convergence-Aware Online Model Selection with Time-Increasing Bandits

Y Xia, F Kong, T Yu, L Guo, RA Rossi, S Kim… - Proceedings of the ACM …, 2024 - dl.acm.org

Web-based applications such as chatbots, search engines and news recommendations
continue to grow in scale and complexity with the recent surge in the adoption of large …

Cited by: 7

[PDF] mlr.press

Model-Based Best Arm Identification for Decreasing Bandits

S Takemori, Y Umeda… - … Conference on Artificial …, 2024 - proceedings.mlr.press

We study the problem of reliably identifying the best (lowest loss) arm in a stochastic multi-
armed bandit when the expected loss of each arm is monotone decreasing as a function of …

Cited by: 1

[PDF] arxiv.org

Best Arm Identification for Stochastic Rising Bandits

M Mussi, A Montenegro, F Trovó, M Restelli… - arXiv preprint arXiv …, 2023 - arxiv.org

Stochastic Rising Bandits (SRBs) model sequential decision-making problems in which the
expected rewards of the available options increase every time they are selected. This setting …

Cited by: 7

[PDF] arxiv.org

Budgeted Online Model Selection and Fine-Tuning via Federated Learning

PM Ghari, Y Shen - arXiv preprint arXiv:2401.10478, 2024 - arxiv.org

Online model selection involves selecting a model from a set of candidate models "on the
fly" to perform prediction on a stream of data. The choice of candidate models henceforth has …

Cited by: 3

[PDF] arxiv.org

Rising Rested Bandits: Lower Bounds and Efficient Algorithms

M Fiandri, AM Metelli, F Trovo - arXiv preprint arXiv:2411.14446, 2024 - arxiv.org

This paper is in the field of stochastic Multi-Armed Bandits (MABs), i.e., those sequential
selection techniques able to learn online using only the feedback given by the chosen …

[PDF] arxiv.org

Rising Rested MAB with Linear Drift

O Amichay, Y Mansour - arXiv preprint arXiv:2501.04403, 2025 - arxiv.org

We consider non-stationary multi-arm bandit (MAB) where the expected reward of each
action follows a linear function of the number of times we executed the action. Our main …

[PDF] openreview.net

Convergence-Aware Online Model Selection with Time-Increasing Bandits

Y Xia, F Kong, T Yu, L Guo, RA Rossi, S Kim… - The Web Conference … - openreview.net

Web-based applications such as chatbots, search engines and news recommendations
continue to grow in scale and complexity with the recent surge in the adoption of large …

Cited by: 1

Scalable Online Decision Making: Algorithm Design and Fundamental Limits

PM Ghari - 2024 - search.proquest.com

Decision-making and real-time prediction in non-stationary and dynamic environments
present significant challenges for the application of machine learning and artificial …

[PDF] escholarship.org

Scalable Online Decision Making: Algorithm Design and Fundamental Limits

P Mollaebrahim Ghari - 2024 - escholarship.org

Decision-making and real-time prediction in non-stationary and dynamic environments
present significant challenges for the application of machine learning and artificial …


Best model identification: A rested bandit formulation

Stochastic rising bandits
